Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies

Bibliographic Details
Published in Scientific data Vol. 6; no. 1; pp. 285 - 9
Main Authors Sevim, Volkan, Lee, Juna, Egan, Robert, Clum, Alicia, Hundley, Hope, Lee, Janey, Everroad, R. Craig, Detweiler, Angela M., Bebout, Brad M., Pett-Ridge, Jennifer, Göker, Markus, Murray, Alison E., Lindemann, Stephen R., Klenk, Hans-Peter, O’Malley, Ronan, Zane, Matthew, Cheng, Jan-Fang, Copeland, Alex, Daum, Christopher, Singer, Esther, Woyke, Tanja
Format Journal Article
Language English
Published London Nature Publishing Group UK 26.11.2019
Nature Publishing Group
Summary: Metagenomic sequence data from defined mock communities is crucial for the assessment of sequencing platform performance and downstream analyses, including assembly, binning and taxonomic assignment. We report a comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms. Our synthetic microbial community BMock12 consists of 12 bacterial strains with genome sizes spanning 3.2–7.2 Mbp, 40–73% GC content, and 1.5–7.3% repeats. Size selection of both PacBio and ONT sequencing libraries prior to sequencing was essential to yield comparable relative abundances of organisms among all sequencing technologies. While the Illumina-based metagenome assembly yielded good coverage with few misassemblies, both the Illumina + ONT and the Illumina + PacBio hybrid assemblies greatly improved contiguity but also increased misassemblies, most notably in genomes with high sequence similarity to each other. Our resulting datasets allow evaluation and benchmarking of bioinformatics software on Illumina, PacBio and ONT platforms in parallel.
Measurement(s): metagenomic data • sequence_assembly
Technology Type(s): ONT MinION • Illumina sequencing • PacBio RS II
Factor Type(s): sequencing platform
Sample Characteristic - Organism: Bacteria
Sample Characteristic - Environment: mock community
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.10260740
Bibliography: AC02-05CH11231; AC52-07NA27344
USDOE Office of Science (SC)
ISSN: 2052-4463
DOI: 10.1038/s41597-019-0287-z