hybridSPAdes: an algorithm for hybrid assembly of short and long reads
Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologie...
Published in | Bioinformatics Vol. 32; no. 7; pp. 1009 - 1015 |
---|---|
Main Authors | Antipov, Dmitry; Korobeynikov, Anton; McLean, Jeffrey S; Pevzner, Pavel A |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 01.04.2016 |
Subjects | |

Abstract | Motivation: Recent advances in single-molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, inexpensive short-read technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has the potential to generate high-quality assemblies at reduced cost. Results: We describe the hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads), thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. Availability and implementation: hybridSPAdes is implemented in C++ as a part of the SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades Contact: d.antipov@spbu.ru Supplementary information: Supplementary data are available at Bioinformatics online. |
---|---|
Author | Antipov, Dmitry Pevzner, Pavel A McLean, Jeffrey S Korobeynikov, Anton |
Author_xml | 1. Antipov, Dmitry (Center for Algorithmic Biotechnology, Institute for Translational Biomedicine); 2. Korobeynikov, Anton (Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia); 3. McLean, Jeffrey S (Department of Periodontics, University of Washington, Seattle, WA 98195, USA); 4. Pevzner, Pavel A (Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, Department of Computer Science and Engineering, University of California, San Diego, USA) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/26589280$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com. |
DOI | 10.1093/bioinformatics/btv688 |
Discipline | Biology |
EISSN | 1367-4811 1460-2059 |
EndPage | 1015 |
ExternalDocumentID | PMC4907386 26589280 |
Genre | Journal Article |
ISSN | 1367-4803 1367-4811 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 7 |
Language | English |
License | The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com. |
Notes | Associate Editor: Inanc Birol |
OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/32/7/1009/19568450/btv688.pdf |
PMID | 26589280 |
PageCount | 7 |
PublicationCentury | 2000 |
PublicationDate | 2016-04-01 |
PublicationDecade | 2010 |
PublicationPlace | England |
PublicationTitle | Bioinformatics |
PublicationTitleAlternate | Bioinformatics |
PublicationYear | 2016 |
Publisher | Oxford University Press |
StartPage | 1009 |
SubjectTerms | Algorithms; Assembling; Assembly; Bacteria; Base Sequence; Benchmarking; Bioinformatics; Chromosome Mapping; Cost engineering; Genome; Genomes; Original Papers; Sequence Analysis, DNA |
Title | hybridSPAdes: an algorithm for hybrid assembly of short and long reads |
URI | https://www.ncbi.nlm.nih.gov/pubmed/26589280 https://www.proquest.com/docview/1785237832 https://www.proquest.com/docview/1787474918 https://www.proquest.com/docview/1808055757 https://pubmed.ncbi.nlm.nih.gov/PMC4907386 |
Volume | 32 |