Centrifuger: lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification
Published in | Genome Biology Vol. 25; no. 1; p. 106 |
---|---|
Main Authors | Song, Li; Langmead, Ben |
Format | Journal Article |
Language | English |
Published | England: BioMed Central / BMC, 25.04.2024 |
Abstract | Centrifuger is an efficient taxonomic classification method that compares sequencing reads against a microbial genome database. In Centrifuger, the Burrows-Wheeler transformed genome sequences are losslessly compressed using a novel scheme called run-block compression. Run-block compression achieves sublinear space complexity and is effective at compressing diverse microbial databases like RefSeq while supporting fast rank queries. Combining this compression method with other strategies for compacting the Ferragina-Manzini (FM) index, Centrifuger reduces the memory footprint by half compared to other FM-index-based approaches. Furthermore, the lossless compression and the unconstrained match length help Centrifuger achieve greater accuracy than competing methods at lower taxonomic levels. |
ArticleNumber | 106 |
Author | Song, Li; Langmead, Ben |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38664753$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_1038_s41467_025_57088_y |
ContentType | Journal Article |
Copyright | 2024. The Author(s). 2024. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1186/s13059-024-03244-4 |
Discipline | Biology |
EISSN | 1474-760X |
EndPage | 106 |
ExternalDocumentID | oai_doaj_org_article_5be3381339a04fabb1c14e1cb01f84ee 38664753 10_1186_s13059_024_03244_4 |
Genre | Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural |
GrantInformation_xml | NHGRI NIH HHS: R01HG011392; NIGMS NIH HHS: P20GM130454; NIGMS NIH HHS: R35GM139602; NIGMS NIH HHS: 3P20GM130454-05WS |
ISSN | 1474-760X 1474-7596 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | FM-index; r-index; Compact data structure; Metagenomic |
Language | English |
License | 2024. The Author(s). |
ORCID | 0000-0002-0180-7426 |
OpenAccessLink | https://doaj.org/article/5be3381339a04fabb1c14e1cb01f84ee |
PMID | 38664753 |
PageCount | 1 |
PublicationCentury | 2000 |
PublicationDate | 2024-04-25 |
PublicationDateYYYYMMDD | 2024-04-25 |
PublicationDecade | 2020 |
PublicationPlace | England |
PublicationPlace_xml | – name: England – name: London |
PublicationTitle | Genome Biology |
PublicationTitleAlternate | Genome Biol |
PublicationYear | 2024 |
Publisher | BioMed Central BMC |
StartPage | 106 |
SubjectTerms | Chlamydia; Classification; Compact data structure; Compression; Data Compression - methods; FM-index; genome; Genome, Bacterial; Genome, Microbial; Genomes; memory; Metagenomic; Metagenomics; Metagenomics - methods; r-index; Sequence Analysis, DNA - methods; Software; Taxonomy |
Title | Centrifuger: lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification |
URI | https://www.ncbi.nlm.nih.gov/pubmed/38664753 https://www.proquest.com/docview/3054212095 https://www.proquest.com/docview/3047944317 https://www.proquest.com/docview/3153655723 https://doaj.org/article/5be3381339a04fabb1c14e1cb01f84ee |
Volume | 25 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | Directory of Open Access Journals |
thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1474-760X&client=summon |
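
The abstract describes compressing Burrows-Wheeler transformed genome sequences while keeping rank queries fast, since rank queries are what an FM-index uses to match a read against the database. The sketch below is a toy illustration of those two ideas only. It is not Centrifuger's run-block compression, and every name in it (`bwt`, `run_lengths`, `ToyFMIndex`) is hypothetical. It builds the transform by sorting rotations, which is practical only for tiny inputs, and answers rank queries from uncompressed prefix-count tables.

```python
from itertools import groupby


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    assert "$" not in text
    s = text + "$"  # '$' is a sentinel that sorts before A, C, G, T
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_lengths(s: str) -> list[tuple[str, int]]:
    """Collapse a string into (symbol, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


class ToyFMIndex:
    """Illustrative FM-index: counts pattern occurrences with rank queries."""

    def __init__(self, text: str) -> None:
        self.bwt = bwt(text)
        # C[c]: number of symbols in the text that sort strictly before c.
        self.C: dict[str, int] = {}
        total = 0
        for c in sorted(set(self.bwt)):
            self.C[c] = total
            total += self.bwt.count(c)
        # occ[c][i]: number of occurrences of c in bwt[:i].
        # Stored uncompressed here; a real index would compress this.
        self.occ: dict[str, list[int]] = {c: [0] for c in self.C}
        for ch in self.bwt:
            for c, counts in self.occ.items():
                counts.append(counts[-1] + (ch == c))

    def rank(self, c: str, i: int) -> int:
        """Number of occurrences of c in the first i symbols of the BWT."""
        return self.occ[c][i]

    def count(self, pattern: str) -> int:
        """Backward search: narrow a suffix-array interval one symbol at a time."""
        lo, hi = 0, len(self.bwt)
        for c in reversed(pattern):
            if c not in self.C:
                return 0
            lo = self.C[c] + self.rank(c, lo)
            hi = self.C[c] + self.rank(c, hi)
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    genome = "ACGTACGTACGT"
    transformed = bwt(genome)
    print(transformed, run_lengths(transformed))
    index = ToyFMIndex(genome)
    print(index.count("ACG"), index.count("GTA"), index.count("TT"))
```

For the repetitive toy sequence `ACGTACGTACGT`, the transform groups identical symbols into a handful of runs, which is the kind of redundancy a compressed representation can exploit. The `count` method never inspects the original text; it relies only on `C` and `rank`, which is why fast rank support over the compressed sequence matters for classification speed.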