hybridSPAdes: an algorithm for hybrid assembly of short and long reads

Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologie...

Full description

Saved in:
Bibliographic Details
Published inBioinformatics Vol. 32; no. 7; pp. 1009 - 1015
Main Authors Antipov, Dmitry, Korobeynikov, Anton, McLean, Jeffrey S, Pevzner, Pavel A
Format Journal Article
LanguageEnglish
Published England Oxford University Press 01.04.2016
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. We describe hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. hybridSPAdes is implemented in C++ as a part of SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades d.antipov@spbu.ru supplementary data are available at Bioinformatics online.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
Associate Editor: Inanc Birol
ISSN:1367-4803
1367-4811
1367-4811
1460-2059
DOI:10.1093/bioinformatics/btv688