An Exact Matching Method for 16S rRNA Taxonomy Classification

One popular approach to taxonomy classification in the microbiome utilizes 16S ribosomal RNA sequences. The main challenge is that 16S rRNA sequences could be almost identical in closely related species, and it is difficult to distinguish them at the species level. Recent approaches are able to achi...

Bibliographic Details
Published in Journal of computational biology Vol. 32; no. 8; pp. 753-760
Main Author Sze, Sing-Hoi
Format Journal Article
Language English
Published United States: Mary Ann Liebert, Inc., publishers, 2025-08-01
ISSN 1557-8666
DOI 10.1089/cmb.2024.0615

Abstract One popular approach to taxonomy classification in the microbiome uses 16S ribosomal RNA sequences. The main challenge is that 16S rRNA sequences can be almost identical in closely related species, which makes them difficult to distinguish at the species level. Recent approaches achieve almost single-nucleotide resolution by constructing an error model of the reads. We develop an exact matching algorithm that uses this single-nucleotide resolution directly. We show that, compared to existing algorithms, our algorithm obtains improved accuracy on recent samples of mock communities and on samples of high compositional complexity. A software program implementing this algorithm is available at http://faculty.cse.tamu.edu/shsze/kmpmatch.
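The abstract does not spell out how the matching is done, so the following is only a minimal sketch of what exact matching of an error-corrected read against reference 16S sequences could look like. It assumes Knuth-Morris-Pratt string matching, a guess prompted solely by the software's name; the function names, taxon labels, and toy sequences are all hypothetical and are not taken from the paper or its software.

```python
def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def occurs_in(pattern, text):
    """Return True if pattern occurs in text with zero mismatches."""
    if not pattern:
        return True
    fail = failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return True
    return False


def classify(read, references):
    """Return the sorted labels of all references that contain the read exactly."""
    return sorted(label for label, seq in references.items()
                  if occurs_in(read, seq))


# Hypothetical toy references that differ by a single nucleotide.
references = {
    "Species A": "ACGTTGCATGCCGTA",
    "Species B": "ACGTTGCATGACGTA",
}
print(classify("GCATGCCG", references))  # read spanning the differing position
print(classify("ACGTTGCA", references))  # read from the shared region
```

In this toy setting, a read that covers the differing nucleotide matches only one reference, while a read from the shared region matches both and stays ambiguous. This is the sense in which exact matching can exploit single-nucleotide resolution when the reads are essentially error-free.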
Copyright 2025, Mary Ann Liebert, Inc., publishers
Discipline Biology
Mathematics
IsPeerReviewed true
IsScholarly true
Keywords 16S rRNA
microbiome
taxonomy classification
License https://www.liebertpub.com/nv/resources-tools/text-and-data-mining-policy/121
ORCID 0009-0005-1173-9454
PMID 40485285
SubjectTerms Algorithms
Bacteria - classification
Bacteria - genetics
Computational Biology - methods
Humans
Microbiota - genetics
Phylogeny
Preface
RNA, Ribosomal, 16S - classification
RNA, Ribosomal, 16S - genetics
Software
URI https://www.liebertpub.com/doi/abs/10.1089/cmb.2024.0615
https://www.ncbi.nlm.nih.gov/pubmed/40485285
https://www.proquest.com/docview/3216916533