A geographically-diverse collection of 418 human gut microbiome pathway genome databases
Published in | Scientific Data Vol. 4; no. 1; p. 170035 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | England: Nature Publishing Group, 11.04.2017 |
Summary: | Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn's disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools. |
---|---|
Bibliography: | These authors contributed equally to this work. T.A. and S.J.H. conceived of the GutCyc project as part of a movement to develop the Environmental Genome Encyclopedia (EngCyc): a compendium of microbial community metabolic blueprints supported by high performance software tools on grids and clouds. N.W.H., K.M.K., A.S.H. and D.K. developed the MetaPathways software pipeline with direction from S.J.H. and assistance from T.A. and others at SRI International. A.S.H. and K.M.K. compiled the microbiome sequence datasets, constructed GutCyc ePGDBs and created figures for the manuscript. T.A. generated validation datasets and drafted an early version of the manuscript with A.S.H. and S.J.H. D.K. developed the GutCyc website. All authors contributed to the final preparation of the manuscript. S.J.H., D.L.D. and D.A.R. supervised the project. All authors reviewed and approved the final manuscript. |
ISSN: | 2052-4463 |
DOI: | 10.1038/sdata.2017.35 |