Symphonizing pileup and full-alignment for deep learning-based long-read variant calling
Deep learning-based variant callers are becoming the standard and have achieved superior single nucleotide polymorphisms calling performance using long reads. Here we present Clair3, which leverages two major method categories: pileup calling handles most variant candidates with speed, and full-alig...
Published in | Nature Computational Science Vol. 2; no. 12; pp. 797 - 803 |
---|---|
Main Authors | Zheng, Zhenxian; Li, Shumin; Su, Junhao; Leung, Amy Wing-Sze; Lam, Tak-Wah; Luo, Ruibang |
Format | Journal Article |
Language | English |
Published | United States: Nature Publishing Group, 01.12.2022 |
Subjects | Algorithms; Alignment; Deep learning; Genomes; Neural networks; Nucleotides |
ISSN | 2662-8457 |
DOI | 10.1038/s43588-022-00387-x |
Abstract | Deep learning-based variant callers are becoming the standard and have achieved superior single nucleotide polymorphisms calling performance using long reads. Here we present Clair3, which leverages two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs faster than any of the other state-of-the-art variant callers and demonstrates improved performance, especially at lower coverage. |
---|---|
AbstractList | Deep learning-based variant callers are becoming the standard and have achieved superior single nucleotide polymorphisms calling performance using long reads. Here we present Clair3, which leverages two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs faster than any of the other state-of-the-art variant callers and demonstrates improved performance, especially at lower coverage. Leveraging both the simple pileup input and full-alignment input, small variant calling using noisy long reads has improved speed and accuracy. |
Author | Luo, Ruibang Leung, Amy Wing-Sze Su, Junhao Lam, Tak-Wah Zheng, Zhenxian Li, Shumin |
Author_xml | – sequence: 1 givenname: Zhenxian orcidid: 0000-0002-6546-2324 surname: Zheng fullname: Zheng, Zhenxian organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China – sequence: 2 givenname: Shumin surname: Li fullname: Li, Shumin organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China – sequence: 3 givenname: Junhao orcidid: 0000-0002-8560-3999 surname: Su fullname: Su, Junhao organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China – sequence: 4 givenname: Amy Wing-Sze surname: Leung fullname: Leung, Amy Wing-Sze organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China – sequence: 5 givenname: Tak-Wah surname: Lam fullname: Lam, Tak-Wah organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China – sequence: 6 givenname: Ruibang orcidid: 0000-0001-9711-6533 surname: Luo fullname: Luo, Ruibang email: rbluo@cs.hku.hk organization: Department of Computer Science, The University of Hong Kong, Hong Kong, China |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38177392$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_1186_s12864_024_11182_5 crossref_primary_10_3389_fendo_2024_1416433 crossref_primary_10_1002_ece3_70987 crossref_primary_10_1093_ve_veae073 crossref_primary_10_1093_bioinformatics_btae712 crossref_primary_10_1186_s13073_024_01391_8 crossref_primary_10_1186_s12859_023_05596_3 crossref_primary_10_1136_jmg_2024_110115 crossref_primary_10_1038_s41467_024_50079_5 crossref_primary_10_1093_gigascience_giaf007 crossref_primary_10_1016_j_xgen_2024_100674 crossref_primary_10_1128_jcm_01576_23 crossref_primary_10_22331_q_2024_12_11_1559 crossref_primary_10_1128_aem_01892_24 crossref_primary_10_1093_bib_bbae473 crossref_primary_10_1016_j_ajhg_2024_01_002 crossref_primary_10_1016_j_jmoldx_2024_12_003 crossref_primary_10_1128_mbio_03203_23 crossref_primary_10_1128_jcm_01083_24 crossref_primary_10_1093_g3journal_jkaf044 crossref_primary_10_1038_s41431_024_01599_7 crossref_primary_10_1016_j_ymthe_2024_11_025 crossref_primary_10_1093_g3journal_jkae113 crossref_primary_10_1038_s41586_023_06842_7 crossref_primary_10_1007_s12033_024_01213_7 crossref_primary_10_38001_ijlsb_1308355 crossref_primary_10_1093_bib_bbae269 crossref_primary_10_7554_eLife_98300 crossref_primary_10_1186_s12859_023_05434_6 crossref_primary_10_1016_j_fsigen_2024_103156 crossref_primary_10_1016_j_fsigen_2024_103154 crossref_primary_10_1093_bioinformatics_btae066 crossref_primary_10_3390_genes16020116 crossref_primary_10_1186_s13073_025_01448_2 crossref_primary_10_1007_s00239_023_10102_7 crossref_primary_10_1038_s41467_024_45688_z crossref_primary_10_1038_s41467_024_47349_7 crossref_primary_10_1186_s40104_023_00896_3 crossref_primary_10_1038_s41467_024_50159_6 crossref_primary_10_1093_jac_dkae060 crossref_primary_10_3389_fbioe_2024_1395659 crossref_primary_10_1038_s41594_024_01423_2 crossref_primary_10_5586_asbp_172516 crossref_primary_10_1038_s41598_025_85757_x crossref_primary_10_1016_j_fochms_2024_100236 crossref_primary_10_1038_s41598_023_42600_5 
crossref_primary_10_1093_bioinformatics_btae744 crossref_primary_10_1016_j_jtha_2024_12_030 crossref_primary_10_1038_s41598_024_78270_0 crossref_primary_10_1186_s13059_023_02863_7 crossref_primary_10_1111_age_13332 crossref_primary_10_1128_spectrum_02082_24 crossref_primary_10_1002_ana_27155 crossref_primary_10_1038_s41467_024_44997_7 crossref_primary_10_1093_clinchem_hvad108 crossref_primary_10_1186_s40168_024_02026_1 crossref_primary_10_3389_fgene_2024_1435087 crossref_primary_10_3389_freae_2024_1362926 crossref_primary_10_1101_gr_278730_123 crossref_primary_10_1016_j_plabm_2024_e00423 crossref_primary_10_1016_j_tig_2024_07_001 crossref_primary_10_1038_s41467_023_39784_9 crossref_primary_10_1038_s41594_025_01512_w crossref_primary_10_1016_j_future_2024_03_050 crossref_primary_10_1139_gen_2024_0121 crossref_primary_10_1101_gr_279364_124 crossref_primary_10_3390_v15020522 crossref_primary_10_1038_s41431_024_01649_0 crossref_primary_10_7554_eLife_98300_3 crossref_primary_10_1007_s10142_025_01534_z crossref_primary_10_1038_s41467_024_49588_0 crossref_primary_10_1186_s12864_024_11172_7 crossref_primary_10_1093_hmg_ddae111 crossref_primary_10_1101_gr_278070_123 crossref_primary_10_1016_j_lanwpc_2025_101473 crossref_primary_10_1155_humu_6657400 crossref_primary_10_1016_j_bcp_2025_116874 crossref_primary_10_1038_s41586_023_06425_6 crossref_primary_10_1038_s41576_023_00590_0 crossref_primary_10_1093_bioinformatics_btae539 crossref_primary_10_1038_s41467_024_51252_6 crossref_primary_10_1002_acn3_70008 crossref_primary_10_1038_s41431_025_01817_w crossref_primary_10_1038_s41598_024_80068_z crossref_primary_10_1093_molbev_msaf021 crossref_primary_10_1093_bib_bbae613 crossref_primary_10_1016_j_cub_2024_06_033 crossref_primary_10_1038_s41467_024_53260_y crossref_primary_10_1038_s41525_024_00445_5 crossref_primary_10_1093_hr_uhae119 crossref_primary_10_1093_bioadv_vbad149 crossref_primary_10_3389_pore_2024_1611676 crossref_primary_10_3390_cancers16071275 
crossref_primary_10_1093_bfgp_elae003 crossref_primary_10_1186_s13100_024_00320_1 crossref_primary_10_1093_g3journal_jkaf021 crossref_primary_10_1093_nsr_nwae335 crossref_primary_10_1101_gr_279273_124 crossref_primary_10_1128_spectrum_03584_23 crossref_primary_10_1186_s13073_024_01419_z crossref_primary_10_1371_journal_pcbi_1010905 crossref_primary_10_1093_gigascience_giaf018 |
ContentType | Journal Article |
Copyright | 2022. The Author(s), under exclusive licence to Springer Nature America, Inc. Copyright Nature Publishing Group Dec 2022 |
Copyright_xml | – notice: 2022. The Author(s), under exclusive licence to Springer Nature America, Inc. – notice: Copyright Nature Publishing Group Dec 2022 |
DBID | NPM 8FE 8FG AFKRA ARAPS AZQEC BENPR BGLVJ CCPQU DWQXO GNUQQ HCIFZ JQ2 K7- P5Z P62 PHGZM PHGZT PKEHL PQEST PQGLB PQQKQ PQUKI PRINS |
DOI | 10.1038/s43588-022-00387-x |
DatabaseName | PubMed ProQuest SciTech Collection ProQuest Technology Collection ProQuest Central UK/Ireland Advanced Technologies & Aerospace Collection ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One ProQuest Central ProQuest Central Student SciTech Premium Collection ProQuest Computer Science Collection Computer Science Database (ProQuest) Advanced Technologies & Aerospace Database ProQuest Advanced Technologies & Aerospace Collection ProQuest Central Premium ProQuest One Academic (New) ProQuest One Academic Middle East (New) ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Applied & Life Sciences ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China |
DatabaseTitle | PubMed Advanced Technologies & Aerospace Collection Computer Science Database ProQuest Central Student Technology Collection ProQuest One Academic Middle East (New) ProQuest Advanced Technologies & Aerospace Collection ProQuest Central Essentials ProQuest Computer Science Collection ProQuest One Academic Eastern Edition SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central Advanced Technologies & Aerospace Database ProQuest One Applied & Life Sciences ProQuest One Academic UKI Edition ProQuest Central Korea ProQuest Central (New) ProQuest One Academic ProQuest One Academic (New) |
DatabaseTitleList | PubMed Advanced Technologies & Aerospace Collection |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Sciences (General) |
EISSN | 2662-8457 |
EndPage | 803 |
ExternalDocumentID | 38177392 |
Genre | Journal Article |
GrantInformation_xml | – fundername: Research Grants Council, University Grants Committee (RGC, UGC) grantid: TRS T21-705/20-N |
GroupedDBID | 0R~ AARCD AAYZH ABJNI ACBWK AFANA AFKRA AFSHS AFWHJ AGHDO AIBTJ ALMA_UNASSIGNED_HOLDINGS ATHPR BGLVJ CCPQU K7- NFIDA NPM ODYON PHGZM PHGZT PQGLB RNT SNYQT SOJ 8FE 8FG ARAPS AZQEC BENPR DWQXO GNUQQ HCIFZ JQ2 P62 PKEHL PQEST PQQKQ PQUKI PRINS |
ID | FETCH-LOGICAL-c331t-d165ecb341c53b904305083def75b3f6169305b3743dd0d0d10b48b4e4c11a603 |
IEDL.DBID | 8FG |
IngestDate | Sat Aug 23 12:44:10 EDT 2025 Mon Jul 21 06:02:39 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 12 |
Language | English |
License | 2022. The Author(s), under exclusive licence to Springer Nature America, Inc. |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c331t-d165ecb341c53b904305083def75b3f6169305b3743dd0d0d10b48b4e4c11a603 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ORCID | 0000-0002-8560-3999 0000-0002-6546-2324 0000-0001-9711-6533 |
PMID | 38177392 |
PQID | 3225996526 |
PQPubID | 7343593 |
PageCount | 7 |
ParticipantIDs | proquest_journals_3225996526 pubmed_primary_38177392 |
PublicationCentury | 2000 |
PublicationDate | 2022-12-01 |
PublicationDateYYYYMMDD | 2022-12-01 |
PublicationDate_xml | – month: 12 year: 2022 text: 2022-12-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States – name: New York |
PublicationTitle | Nature Computational Science |
PublicationTitleAlternate | Nat Comput Sci |
PublicationYear | 2022 |
Publisher | Nature Publishing Group |
Publisher_xml | – name: Nature Publishing Group |
SSID | ssj0002513063 |
Score | 2.5833082 |
Snippet | Deep learning-based variant callers are becoming the standard and have achieved superior single nucleotide polymorphisms calling performance using long reads.... |
SourceID | proquest pubmed |
SourceType | Aggregation Database Index Database |
StartPage | 797 |
SubjectTerms | Algorithms Alignment Deep learning Genomes Neural networks Nucleotides |
Title | Symphonizing pileup and full-alignment for deep learning-based long-read variant calling |
URI | https://www.ncbi.nlm.nih.gov/pubmed/38177392 https://www.proquest.com/docview/3225996526 |
Volume | 2 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | ProQuest |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Symphonizing+pileup+and+full-alignment+for+deep+learning-based+long-read+variant+calling&rft.jtitle=Nature+Computational+Science&rft.au=Zheng%2C+Zhenxian&rft.au=Li%2C+Shumin&rft.au=Su%2C+Junhao&rft.au=Leung%2C+Amy+Wing-Sze&rft.date=2022-12-01&rft.pub=Nature+Publishing+Group&rft.eissn=2662-8457&rft.volume=2&rft.issue=12&rft.spage=797&rft.epage=803&rft_id=info:doi/10.1038%2Fs43588-022-00387-x |
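The abstract describes a two-stage design: a fast pileup caller handles most variant candidates, and a slower full-alignment caller handles the complicated ones. A minimal sketch of that dispatch pattern follows. It is an illustration only, not the Clair3 implementation: the `Candidate` structure, the caller signature, the `two_stage_call` function, and the `min_confidence` threshold are all hypothetical, and the rule used here to decide which candidates count as "complicated" (low pileup confidence) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A variant candidate site (hypothetical structure for illustration)."""
    chrom: str
    pos: int


# A caller maps a candidate to (genotype label, confidence in [0, 1]).
Caller = Callable[[Candidate], Tuple[str, float]]


def two_stage_call(
    candidates: List[Candidate],
    pileup_caller: Caller,
    full_alignment_caller: Caller,
    min_confidence: float = 0.9,  # hypothetical threshold, not from the paper
) -> List[Tuple[Candidate, str, str]]:
    """Call every candidate with the fast pileup model, then re-call only
    the low-confidence ones with the slower full-alignment model.

    Returns (candidate, genotype, stage) tuples, where stage records which
    caller produced the final genotype.
    """
    results = []
    for cand in candidates:
        genotype, confidence = pileup_caller(cand)
        stage = "pileup"
        if confidence < min_confidence:
            # Assumed rule: low pileup confidence marks a complicated site.
            genotype, _ = full_alignment_caller(cand)
            stage = "full_alignment"
        results.append((cand, genotype, stage))
    return results
```

Passing the two callers as plain functions keeps the sketch independent of any particular model; the expensive caller runs only on the subset of candidates the cheap caller is unsure about, which is the speed and accuracy trade-off the abstract points to.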