A geographically-diverse collection of 418 human gut microbiome pathway genome databases

Bibliographic Details
Published in Scientific Data Vol. 4; no. 1; p. 170035
Main Authors Hahn, Aria S, Altman, Tomer, Konwar, Kishori M, Hanson, Niels W, Kim, Dongjae, Relman, David A, Dill, David L, Hallam, Steven J
Format Journal Article
Language English
Published England Nature Publishing Group 11.04.2017
Summary: Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn's disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools.
Bibliography: These authors contributed equally to this work.
T.A. and S.J.H. conceived of the GutCyc project as part of a movement to develop the Environmental Genome Encyclopedia (EngCyc): a compendium of microbial community metabolic blueprints supported by high performance software tools on grids and clouds. N.W.H., K.M.K., A.S.H. and D.K. developed the MetaPathways software pipeline with direction from S.J.H. and assistance from T.A. and others at SRI International. A.S.H. and K.M.K. compiled the microbiome sequence datasets, constructed GutCyc ePGDBs and created figures for the manuscript. T.A. generated validation datasets and drafted an early version of the manuscript with A.S.H. and S.J.H. D.K. developed the GutCyc website. All authors contributed to the final preparation of the manuscript. S.J.H., D.L.D. and D.A.R. supervised the project. All authors reviewed and approved the final manuscript.
ISSN: 2052-4463
DOI: 10.1038/sdata.2017.35