BWA-MEME: BWA-MEM emulated with a machine learning approach

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 38; no. 9; pp. 2404-2413
Main Authors Jung, Youngmok; Han, Dongsu
Format Journal Article
Language English
Published England Oxford University Press 28.04.2022

Abstract
Motivation: The growing use of next-generation sequencing and enlarged sequencing throughput require efficient short-read alignment, where seeding is one of the major performance bottlenecks. The key challenge in the seeding phase is searching for exact matches of substrings of short reads in the reference DNA sequence. Existing algorithms, however, present limitations in performance due to their frequent memory accesses.
Results: This article presents BWA-MEME, the first full-fledged short-read alignment software that leverages learned indices to solve the exact match search problem for efficient seeding. BWA-MEME is a practical and efficient seeding algorithm based on a suffix array search algorithm; it addresses the challenges of using learned indices for super-maximal exact match (SMEM) search, which is used extensively in the seeding phase. Our evaluation shows that BWA-MEME achieves up to 3.45× speedup in seeding throughput over BWA-MEM2 by reducing the number of instructions by 4.60×, memory accesses by 8.77× and last-level cache (LLC) misses by 2.21×, while producing SAM output identical to that of BWA-MEM2.
Availability and implementation: The source code and test scripts are available for academic use at https://github.com/kaist-ina/BWA-MEME/.
Supplementary information: Supplementary data are available at Bioinformatics online.
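The general idea named in the abstract, a learned index guiding an exact-match search over a suffix array, can be illustrated with a toy sketch. This is not the BWA-MEME implementation: the class name `LearnedSuffixIndex`, the helper `encode`, the key length `K`, and the use of a single least-squares line as the model are all assumptions made for illustration, and the suffix array is built naively. The sketch fits a line that predicts a suffix's rank from its first `K` bases, records the worst prediction error, and then binary-searches only inside that error window.

```python
# Toy learned-index lookup over a suffix array (illustrative only).
ALPHABET = {"$": 0, "A": 1, "C": 2, "G": 3, "T": 4}
K = 4  # leading bases fed to the model; a hypothetical choice


def encode(s):
    """Map the first K symbols of s to an integer, padding with '$'."""
    key = 0
    for ch in s[:K].ljust(K, "$"):
        key = key * len(ALPHABET) + ALPHABET[ch]
    return key


class LearnedSuffixIndex:
    def __init__(self, reference):
        assert reference, "reference must be non-empty"
        self.ref = reference
        n = len(reference)
        # Naive suffix array construction; acceptable only for tiny inputs.
        self.sa = sorted(range(n), key=lambda i: reference[i:])
        keys = [encode(reference[i:]) for i in self.sa]
        # Least-squares line: rank ~ slope * key + intercept.
        mean_k = sum(keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The worst-case error bounds the window that must be searched.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(keys))

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def _bound(self, pattern, lo, hi, strict):
        """First rank in [lo, hi) whose prefix is >= pattern (> if strict)."""
        m = len(pattern)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = self.ref[self.sa[mid]:self.sa[mid] + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def find(self, pattern):
        """Return sorted start positions of all exact matches of pattern."""
        n = len(self.sa)
        lo, hi = 0, n
        if len(pattern) >= K:
            # Every suffix sharing the pattern's first K bases has a rank
            # within max_err of the prediction, so the window is sufficient.
            guess = self._predict(encode(pattern))
            lo = min(max(0, guess - self.max_err), n)
            hi = max(lo, min(n, guess + self.max_err + 1))
        left = self._bound(pattern, lo, hi, strict=False)
        right = self._bound(pattern, lo, hi, strict=True)
        return sorted(self.sa[left:right])


index = LearnedSuffixIndex("GATTACAGATTACA")
print(index.find("GATTA"))  # start positions of exact matches
```

Patterns shorter than `K` fall back to a binary search over the whole suffix array, since the model's key is not fully determined for them. A single linear fit is the simplest possible model; how well it narrows the window depends entirely on the key distribution of the reference.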
Author details
  Jung, Youngmok (ORCID: 0000-0002-2613-1442)
  Han, Dongsu (ORCID: 0000-0001-6922-7244; email: dhan.ee@kaist.ac.kr)
BackLink https://www.ncbi.nlm.nih.gov/pubmed/35253835 (MEDLINE/PubMed record)
ContentType Journal Article
Copyright The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
DOI 10.1093/bioinformatics/btac137
Discipline Biology
EISSN 1367-4811
EndPage 2413
ExternalDocumentID 35253835
10_1093_bioinformatics_btac137
10.1093/bioinformatics/btac137
Genre Journal Article
ISSN 1367-4803
1367-4811
IsPeerReviewed true
IsScholarly true
Issue 9
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
ORCID 0000-0001-6922-7244
0000-0002-2613-1442
PMID 35253835
PageCount 10
PublicationCentury 2000
PublicationDate 2022-04-28
PublicationDecade 2020
PublicationPlace England
PublicationTitle Bioinformatics (Oxford, England)
PublicationTitleAlternate Bioinformatics
PublicationYear 2022
Publisher Oxford University Press
StartPage 2404
Title BWA-MEME: BWA-MEM emulated with a machine learning approach
URI https://www.ncbi.nlm.nih.gov/pubmed/35253835
https://www.proquest.com/docview/2636870097
Volume 38