Assembling single-cell genomes and mini-metagenomes from chimeric MDA products
Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated nu...
Published in | Journal of computational biology Vol. 20; no. 10; p. 714 |
---|---|
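Main Authors | Nurk, Sergey; Bankevich, Anton; Antipov, Dmitry; Gurevich, Alexey A; Korobeynikov, Anton; Lapidus, Alla; Prjibelski, Andrey D; Pyshkin, Alexey; Sirotkin, Alexander; Sirotkin, Yakov; Stepanauskas, Ramunas; Clingenpeel, Scott R; Woyke, Tanja; McLean, Jeffrey S; Lasken, Roger; Tesler, Glenn; Alekseyev, Max A; Pevzner, Pavel A |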
Format | Journal Article |
Language | English |
Published | United States, 01.10.2013 |
ISSN | 1557-8666 |
DOI | 10.1089/cmb.2013.0084 |
Abstract | Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "microbial dark matter" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license. |
---|---|
Author | Nurk, Sergey; Bankevich, Anton; Antipov, Dmitry; Gurevich, Alexey A; Korobeynikov, Anton; Lapidus, Alla; Prjibelski, Andrey D; Pyshkin, Alexey; Sirotkin, Alexander; Sirotkin, Yakov; Stepanauskas, Ramunas; Clingenpeel, Scott R; Woyke, Tanja; McLean, Jeffrey S; Lasken, Roger; Tesler, Glenn; Alekseyev, Max A; Pevzner, Pavel A |
Author affiliation | Nurk, Sergey: Algorithmic Biology Laboratory, St. Petersburg Academic University, Russian Academy of Sciences, St. Petersburg, Russia |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/24093227$$D View this record in MEDLINE/PubMed |
Discipline | Biology; Mathematics |
EISSN | 1557-8666 |
Genre | Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S. |
GrantInformation | NIGMS NIH HHS: 1R01GM095373; NCRR NIH HHS: 3P41RR024851-02S1; NHGRI NIH HHS: 2R01HG003647; NIGMS NIH HHS: R01 GM095373 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 10 |
PMID | 24093227 |
PublicationDate | 2013-10-01 |
PublicationPlace | United States |
PublicationTitle | Journal of computational biology |
PublicationTitleAlternate | J Comput Biol |
PublicationYear | 2013 |
StartPage | 714 |
SubjectTerms | Algorithms; Base Composition; Computational Biology; Contig Mapping - methods; DNA, Bacterial - genetics; DNA, Concatenated - genetics; Escherichia coli - genetics; Gene Library; Genome, Bacterial; High-Throughput Nucleotide Sequencing; Nucleic Acid Amplification Techniques; Pedobacter - genetics; Prochlorococcus - genetics; Sequence Analysis, DNA; Single-Cell Analysis |
Title | Assembling single-cell genomes and mini-metagenomes from chimeric MDA products |
URI | https://www.ncbi.nlm.nih.gov/pubmed/24093227 |
Volume | 20 |