BWA-MEME: BWA-MEM emulated with a machine learning approach
Published in | Bioinformatics (Oxford, England) Vol. 38; no. 9; pp. 2404 - 2413 |
---|---|
Main Authors | Jung, Youngmok; Han, Dongsu |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 28.04.2022 |
Abstract | Abstract
Motivation
The growing use of next-generation sequencing and increasing sequencing throughput require efficient short-read alignment, in which seeding is one of the major performance bottlenecks. The key challenge in the seeding phase is searching for exact matches of substrings of short reads in the reference DNA sequence. Existing algorithms, however, are limited in performance by their frequent memory accesses.
Results
This article presents BWA-MEME, the first full-fledged short-read alignment software that leverages learned indices to solve the exact-match search problem for efficient seeding. BWA-MEME is a practical and efficient seeding algorithm, based on a suffix array search algorithm, that solves the challenges in utilizing learned indices for SMEM search, which is extensively used in the seeding phase. Our evaluation shows that BWA-MEME achieves up to 3.45× speedup in seeding throughput over BWA-MEM2 by reducing the number of instructions by 4.60×, memory accesses by 8.77× and LLC misses by 2.21×, while producing SAM output identical to that of BWA-MEM2.
Availability and implementation
The source code and test scripts are available for academic use at https://github.com/kaist-ina/BWA-MEME/.
Supplementary information
Supplementary data are available at Bioinformatics online. |
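The abstract describes exact-match search over a suffix array in which a learned index supplies the starting point. As a rough illustration only, the Python sketch below fits a least-squares line from a numeric encoding of each suffix's leading bases to its rank in a toy suffix array, then corrects the prediction with a galloping search. It is not BWA-MEME's algorithm or data layout; the names `LearnedSuffixIndex`, `encode`, `KEY_LEN` and the choice of a single linear model are assumptions made for this example.

```python
"""Toy sketch: exact-match search on a suffix array, with a learned model
supplying only the starting guess. All names here are hypothetical."""

BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
KEY_LEN = 8  # assumed number of leading bases packed into the numeric key


def encode(seq, k=KEY_LEN):
    """Pack the first k bases into a base-4 integer; short inputs are padded with 'A'."""
    key = 0
    for i in range(k):
        key = key * 4 + (BASE_CODE[seq[i]] if i < len(seq) else 0)
    return key


class LearnedSuffixIndex:
    def __init__(self, ref):
        if not ref:
            raise ValueError("reference must be non-empty")
        self.ref = ref
        # Naive suffix array construction; acceptable for a toy reference only.
        self.sa = sorted(range(len(ref)), key=lambda i: ref[i:])
        keys = [encode(ref[i:i + KEY_LEN]) for i in self.sa]
        n = len(keys)
        # "Learn" rank ~ slope * key + intercept by ordinary least squares.
        mean_x = sum(keys) / n
        mean_y = (n - 1) / 2
        var = sum((x - mean_x) ** 2 for x in keys)
        cov = sum((x - mean_x) * (r - mean_y) for r, x in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_y - self.slope * mean_x

    def _predict(self, pattern):
        """Model's guess for the rank of the first suffix not smaller than pattern."""
        return int(self.slope * encode(pattern) + self.intercept)

    def _first_rank(self, guess, goes_before):
        """Smallest rank r in [0, n] where goes_before(r) is False.

        goes_before must be True for a prefix of ranks and False afterwards.
        The search gallops outward from guess, so a wrong guess costs extra
        comparisons but never a wrong answer.
        """
        n = len(self.sa)
        g = min(max(guess, 0), n)
        step = 1
        if g < n and goes_before(g):
            # Answer lies to the right of g.
            lo, probe = g + 1, g + 1
            while probe < n and goes_before(probe):
                lo = probe + 1
                step *= 2
                probe = g + step
            hi = min(probe, n)
        else:
            # Answer lies at g or to its left.
            hi, probe = g, g - 1
            while probe >= 0 and not goes_before(probe):
                hi = probe
                step *= 2
                probe = g - step
            lo = probe + 1 if probe >= 0 else 0
        while lo < hi:  # ordinary binary search inside the bracket
            mid = (lo + hi) // 2
            if goes_before(mid):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def find(self, pattern):
        """Return sorted start positions of every exact occurrence of pattern."""
        m = len(pattern)

        def window(r):
            return self.ref[self.sa[r]:self.sa[r] + m]

        lo = self._first_rank(self._predict(pattern), lambda r: window(r) < pattern)
        hi = self._first_rank(lo, lambda r: window(r) <= pattern)
        return sorted(self.sa[lo:hi])


if __name__ == "__main__":
    index = LearnedSuffixIndex("ACGTACGTTAGCACGTAC")
    print(index.find("ACGT"))
```

Because the model only seeds the search, correctness does not depend on how well the line fits; a better fit simply shortens the galloping phase, which is the intuition behind reducing memory accesses with a learned index.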
---|---|
Author | Han, Dongsu; Jung, Youngmok |
Author_xml | 1. Jung, Youngmok (ORCID: 0000-0002-2613-1442); 2. Han, Dongsu (ORCID: 0000-0001-6922-7244; email: dhan.ee@kaist.ac.kr) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/35253835$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com |
DOI | 10.1093/bioinformatics/btac137 |
Discipline | Biology |
EISSN | 1367-4811 |
EndPage | 2413 |
ExternalDocumentID | 35253835 10_1093_bioinformatics_btac137 10.1093/bioinformatics/btac137 |
Genre | Journal Article |
ISSN | 1367-4803 1367-4811 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 9 |
License | This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model). |
ORCID | 0000-0001-6922-7244 0000-0002-2613-1442 |
PMID | 35253835 |
PageCount | 10 |
PublicationDate | 2022-04-28 |
PublicationPlace | England |
PublicationTitle | Bioinformatics (Oxford, England) |
PublicationTitleAlternate | Bioinformatics |
PublicationYear | 2022 |
Publisher | Oxford University Press |
StartPage | 2404 |
Title | BWA-MEME: BWA-MEM emulated with a machine learning approach |
URI | https://www.ncbi.nlm.nih.gov/pubmed/35253835 https://www.proquest.com/docview/2636870097 |
Volume | 38 |