Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads

Bibliographic Details
Published in Nature methods Vol. 18; no. 11; pp. 1322 - 1332
Main Authors Shafin, Kishwar; Pesout, Trevor; Chang, Pi-Chuan; Nattestad, Maria; Kolesnikov, Alexey; Goel, Sidharth; Baid, Gunjan; Kolmogorov, Mikhail; Eizenga, Jordan M.; Miga, Karen H.; Carnevali, Paolo; Jain, Miten; Carroll, Andrew; Paten, Benedict
Format Journal Article
Language English
Published New York: Nature Publishing Group US, 01.11.2021
Nature Publishing Group
Abstract Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished). The PEPPER-Margin-DeepVariant pipeline achieves highly accurate variant calling using nanopore and other long-read sequencing data.
Audience Academic
Author Carroll, Andrew
Pesout, Trevor
Nattestad, Maria
Carnevali, Paolo
Paten, Benedict
Miga, Karen H.
Kolesnikov, Alexey
Goel, Sidharth
Jain, Miten
Eizenga, Jordan M.
Baid, Gunjan
Shafin, Kishwar
Chang, Pi-Chuan
Kolmogorov, Mikhail
AuthorAffiliation 1 UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
2 Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA
3 Chan Zuckerberg Initiative, Redwood City, CA, USA
Author_xml – sequence: 1
  givenname: Kishwar
  orcidid: 0000-0001-5252-3434
  surname: Shafin
  fullname: Shafin, Kishwar
  organization: UC Santa Cruz Genomics Institute
– sequence: 2
  givenname: Trevor
  surname: Pesout
  fullname: Pesout, Trevor
  organization: UC Santa Cruz Genomics Institute
– sequence: 3
  givenname: Pi-Chuan
  orcidid: 0000-0003-3021-6446
  surname: Chang
  fullname: Chang, Pi-Chuan
  organization: Google Inc
– sequence: 4
  givenname: Maria
  surname: Nattestad
  fullname: Nattestad, Maria
  organization: Google Inc
– sequence: 5
  givenname: Alexey
  surname: Kolesnikov
  fullname: Kolesnikov, Alexey
  organization: Google Inc
– sequence: 6
  givenname: Sidharth
  surname: Goel
  fullname: Goel, Sidharth
  organization: Google Inc
– sequence: 7
  givenname: Gunjan
  surname: Baid
  fullname: Baid, Gunjan
  organization: Google Inc
– sequence: 8
  givenname: Mikhail
  surname: Kolmogorov
  fullname: Kolmogorov, Mikhail
  organization: UC Santa Cruz Genomics Institute
– sequence: 9
  givenname: Jordan M.
  surname: Eizenga
  fullname: Eizenga, Jordan M.
  organization: UC Santa Cruz Genomics Institute
– sequence: 10
  givenname: Karen H.
  orcidid: 0000-0002-3670-4507
  surname: Miga
  fullname: Miga, Karen H.
  organization: UC Santa Cruz Genomics Institute
– sequence: 11
  givenname: Paolo
  surname: Carnevali
  fullname: Carnevali, Paolo
  organization: Chan Zuckerberg Initiative
– sequence: 12
  givenname: Miten
  orcidid: 0000-0002-4571-3982
  surname: Jain
  fullname: Jain, Miten
  organization: UC Santa Cruz Genomics Institute
– sequence: 13
  givenname: Andrew
  orcidid: 0000-0002-4824-6689
  surname: Carroll
  fullname: Carroll, Andrew
  email: awcarroll@google.com
  organization: Google Inc
– sequence: 14
  givenname: Benedict
  orcidid: 0000-0001-8863-3539
  surname: Paten
  fullname: Paten, Benedict
  email: bpaten@ucsc.edu
  organization: UC Santa Cruz Genomics Institute
BackLink https://www.ncbi.nlm.nih.gov/pubmed/34725481 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Nature America, Inc. 2021
COPYRIGHT 2021 Nature Publishing Group
DOI 10.1038/s41592-021-01299-w
Discipline Biology
EISSN 1548-7105
EndPage 1332
ExternalDocumentID PMC8571015
A681286822
34725481
10_1038_s41592_021_01299_w
Genre Journal Article
Research Support, N.I.H., Extramural
GeographicLocations United States
GrantInformation_xml – fundername: U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute (NHGRI)
  grantid: U41HG010972; R01HG010485; U01HG010961; OT2OD026682
  funderid: https://doi.org/10.13039/100000051
– fundername: NHGRI NIH HHS
  grantid: U01 HG010961
– fundername: NIH HHS
  grantid: OT2 OD026682
– fundername: NHGRI NIH HHS
  grantid: U24 HG010262
– fundername: NHGRI NIH HHS
  grantid: R01 HG010485
– fundername: NHGRI NIH HHS
  grantid: U41 HG010972
ISSN 1548-7091
1548-7105
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 11
Language English
License 2021. The Author(s), under exclusive licence to Springer Nature America, Inc.
Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms
Notes Author Contributions
B.P. and A.C. designed and executed the study. K.S. developed PEPPER. T.P. developed Margin. P.C. designed candidate import functionality in DeepVariant. K.S., T.P. and P.C. contributed equally to the methods development and core analysis presented. M.N. designed alt-event alignment in DeepVariant. A.K. contributed to haplotype sorting and improvements to DeepVariant runtime. S.G. contributed to the candidate import module of DeepVariant. G.B. designed and executed the post-processing model to improve multiallelic variant accuracy. M.K. designed and evaluated assembly polishing. J.M.E. designed the local phasing metric and contributed to phasing evaluation. K.H.M. provided experimental design guidance. P.C. generated assemblies and provided guidance on assembly polishing. M.J. performed nanopore sequencing and quality control and helped to design and execute analysis. All authors approve of the final manuscript.
ORCID 0000-0002-4824-6689
0000-0003-3021-6446
0000-0002-4571-3982
0000-0001-8863-3539
0000-0001-5252-3434
0000-0002-3670-4507
OpenAccessLink https://pubmed.ncbi.nlm.nih.gov/PMC8571015
PMID 34725481
PageCount 11
PublicationCentury 2000
PublicationDate 2021-11-01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: United States
PublicationSubtitle Techniques for life scientists and chemists
PublicationTitle Nature methods
PublicationTitleAbbrev Nat Methods
PublicationTitleAlternate Nat Methods
PublicationYear 2021
Publisher Nature Publishing Group US
Nature Publishing Group
Publisher_xml – name: Nature Publishing Group US
– name: Nature Publishing Group
SourceType Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 1322
SubjectTerms 631/114/1305
631/114/2785
631/1647/794
631/208/726/649
631/61/212
Bioinformatics
Biological Microscopy
Biological Techniques
Biomedical and Life Sciences
Biomedical Engineering/Biotechnology
Diploids
DNA sequencing
Genes
Genetic research
Genetic variation
Genome, Human
Genomes
Genomics
Genotyping
Haplotypes
High-Throughput Nucleotide Sequencing - methods
Humans
Identification methods
Life Sciences
Methods
Molecular Sequence Annotation
Nanopores
Nanotechnology
Nucleotide sequencing
Nucleotides
Pipelines
Polymorphism, Single Nucleotide
Proteomics
Sequence Analysis, DNA - methods
Software
Vegetables
Title Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads
URI https://link.springer.com/article/10.1038/s41592-021-01299-w
https://www.ncbi.nlm.nih.gov/pubmed/34725481
https://www.proquest.com/docview/2592764350
https://www.proquest.com/docview/2592311163
https://pubmed.ncbi.nlm.nih.gov/PMC8571015
Volume 18