PacBio Long Reads Improve Metagenomic Assemblies, Gene Catalogs, and Genome Binning
Published in | Frontiers in Genetics Vol. 11; p. 516269 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Frontiers Media S.A. 08.09.2020 |
Summary: | PacBio long-read sequencing offers several potential advantages for DNA assembly, including more complete gene profiling of metagenomic samples. However, its lower single-pass accuracy can make gene discovery and assembly difficult for low-abundance organisms. To evaluate the application and performance of PacBio long reads and Illumina HiSeq short reads in metagenomic analyses, we directly compared various assemblies of PacBio and Illumina sequencing reads from two anaerobic digestion microbiome samples taken from a biogas fermenter. The PacBio platform produced 1.58 million long reads (19.6 Gb) with an average length of 7,604 bp. The Illumina HiSeq platform produced 151.2 million read pairs (45.4 Gb). Hybrid assemblies using PacBio long reads and HiSeq contigs improved assembly statistics, including increases in the average contig length, contig N50 size, and number of large contigs. Interestingly, depth-based hybrid assemblies generated a higher percentage of complete genes (98.86%) than assemblies based on HiSeq contigs only (40.29%), because the PacBio reads were long enough to cover many short repetitive elements and capture multiple genes in a single read. Additionally, incorporating PacBio long reads considerably reduced contig numbers and increased the completeness of the genome reconstruction, which was poorly assembled and binned when HiSeq data alone were used. From this comparison of PacBio long reads with Illumina HiSeq short reads on complex microbiome samples, we conclude that PacBio long reads can produce longer contigs, more complete genes, and better genome binning, thereby offering more information about metagenomic samples. |
---|---|
Bibliography: | Edited by: Barbara J. Campbell, Clemson University, United States. Reviewed by: Xiyin Wang, North China University of Science and Technology, China; Wei Xu, Texas A&M University Corpus Christi, United States. These authors have contributed equally to this work. This article was submitted to Genomic Assay Technology, a section of the journal Frontiers in Genetics. |
ISSN: | 1664-8021 |
DOI: | 10.3389/fgene.2020.516269 |
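
The summary compares assemblies by average contig length, contig N50 size, and number of large contigs. As a rough illustration of how such statistics are commonly derived from a list of contig lengths, here is a minimal Python sketch. The function names, the toy contig lengths, and the 1,000 bp cutoff for a "large" contig are assumptions made for this example; the record does not state the thresholds or tools the authors used.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def assembly_stats(lengths: Sequence[int], large_cutoff: int = 1000) -> Dict[str, float]:
    """Summarize an assembly. large_cutoff is a hypothetical threshold."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "mean_bp": sum(lengths) / len(lengths),
        "n50_bp": n50(lengths),
        "large_contigs": sum(1 for n in lengths if n >= large_cutoff),
    }


if __name__ == "__main__":
    # Toy data only, not values from the study.
    short_read_like = [100, 200, 300, 400, 500]
    hybrid_like = [2000, 1500, 800, 200]
    print(assembly_stats(short_read_like))
    print(assembly_stats(hybrid_like))
```

With these toy inputs, the second list has fewer contigs but a larger mean length and N50, which is the direction of change the summary reports for hybrid assemblies.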