Leveraging known genomic variants to improve detection of variants, especially close-by Indels

Bibliographic Details
Published in Bioinformatics Vol. 34; no. 17; pp. 2918 - 2926
Main Authors Vo, Nam S; Phan, Vinhthuy
Format Journal Article
Language English
Published England, Oxford University Press, 01.09.2018
ISSN 1367-4803
1367-4811
1460-2059
DOI 10.1093/bioinformatics/bty183


Abstract
Motivation: The detection of genomic variants is of great significance in genomics, bioinformatics, biomedical research and their applications. However, despite considerable effort, Indels and structural variants are still under-characterized compared to SNPs. Current approaches based on next-generation sequencing data usually require large numbers of reads (high coverage) to detect such variants accurately. However, Indels, especially those close to each other, are still hard to detect accurately.
Results: We introduce a novel approach that leverages known variant information, e.g. provided by dbSNP, dbVar, ExAC or the 1000 Genomes Project, to improve the sensitivity of detecting variants, especially close-by Indels. In our approach, the standard reference genome and the known variants are combined to build a meta-reference, which is expected to be probabilistically closer to the subject genomes than the standard reference. An alignment algorithm that takes known variant information into account is developed to accurately align reads to the meta-reference. This strategy resulted in accurate alignment and variant calling even with low coverage data. We showed that, compared to popular methods such as GATK and SAMtools, our method significantly improves the sensitivity of detecting variants, especially Indels that are close to each other. In particular, our method called these close-by Indels at 15-20% higher sensitivity than other methods at low coverage, and still achieved 1-5% higher sensitivity at high coverage, at competitive precision. These results were validated using simulated data with variant profiles extracted from the 1000 Genomes Project data, and real data from the Illumina Platinum Genomes Project and the ExAC database. Our findings suggest that by incorporating known variant information in an appropriate manner, sensitive variant calling is possible at a low cost.
Availability and implementation: Implementation can be found in our public code repository https://github.com/namsyvo/IVC.
Supplementary information: Supplementary data are available at Bioinformatics online.
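As a rough illustration of the meta-reference idea described in the abstract, the Python sketch below attaches known alternative alleles and their population frequencies to positions of a reference sequence, and computes sensitivity and precision for a set of variant calls. It is not the IVC implementation. The names MetaReference, build_meta_reference and sensitivity_precision, the per-position allele-probability layout, and the toy data are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class MetaReference:
    """Hypothetical container: a reference sequence plus known alleles per position."""
    sequence: str
    # 0-based position -> {allele: prior probability}, reference allele included
    variants: Dict[int, Dict[str, float]] = field(default_factory=dict)

    def alleles_at(self, pos: int) -> Dict[str, float]:
        """Candidate alleles at pos; positions without known variants keep the reference base."""
        return self.variants.get(pos, {self.sequence[pos]: 1.0})


def build_meta_reference(
    sequence: str,
    known_variants: Iterable[Tuple[int, str, str, float]],
) -> MetaReference:
    """Combine a reference with known variants given as (pos, ref, alt, alt_frequency)."""
    meta = MetaReference(sequence)
    for pos, ref, alt, freq in known_variants:
        if sequence[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at position {pos}")
        site = meta.variants.setdefault(pos, {ref: 1.0})
        site[alt] = freq
        # Assumption: the remaining probability mass stays with the reference allele.
        site[ref] = max(0.0, site[ref] - freq)
    return meta


def sensitivity_precision(called: Set[tuple], truth: Set[tuple]) -> Tuple[float, float]:
    """Sensitivity = TP / |truth|, precision = TP / |called|."""
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    # Toy data: one insertion (G -> GTT) and one deletion (CG -> C) close to each other.
    meta = build_meta_reference("ACGTACGT", [(2, "G", "GTT", 0.25), (5, "CG", "C", 0.5)])
    print(meta.alleles_at(2))
    print(meta.alleles_at(0))
    truth = {(2, "G", "GTT"), (5, "CG", "C")}
    called = {(2, "G", "GTT"), (7, "T", "A")}
    print(sensitivity_precision(called, truth))
```

In this toy layout, an aligner could score a read against every candidate allele at a site instead of only the reference base, which is one way to read the abstract's statement that the meta-reference is probabilistically closer to the subject genome.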
Author affiliations
Vo, Nam S (vosynam@gmail.com), Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Phan, Vinhthuy, Department of Computer Science, The University of Memphis, Memphis, TN, USA
PubMed record https://www.ncbi.nlm.nih.gov/pubmed/29590294
Copyright The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2018
Discipline Biology
Funding National Science Foundation, Computing and Communication Foundations (grant NSF CCF-1320297)
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
OpenAccessLink https://academic.oup.com/bioinformatics/article-pdf/34/17/2918/25702995/bty183.pdf
PMID 29590294
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3dS8MwEA9zIPgifls_RgRfBOfaJF2bRxHnFD9eNuiTJU1SGMxWXCf0v_fStJMpoj6VQnOFu2vvd8nd7xA6JSnRKpQppCVEQYKSajPmRXc5SZikmoS62tB_eOwPx-wu8qMWcptemK9H-Jz2kklek4ga4uJeUpTghvDThUBsHHv0FH3WdLiGGcbeABJgdqStYfYOXdr07_wkcikyLXW7fQOdVfAZbKD1GjXiS2vmTdTS2RZatXMky230fK_BI6t5Q9hskmXYUK--TCR-h1TYVLrgIseTav9AY6WLqv4qw3m6eOAc65kdRT8tsZzmM9BmiW8zBbFzB40H16OrYbcenNCVjLpmvHzKJVEcrhBtQiUUY8oUNAVuIAUkEUwAsBAAr7wkEL70-4qEnicpB4NpCGq7qJ3lmd43Ld2JDg1FTxqAEEjOmM8TShUXXAB6CBzEGqXFsmYVN8MtprE93abxsq5jq2sHXSyWvVpajd8WnIFF_vrsSWO3GD4WcwIiMp3PZzGgHQ6IxnddB-1Zgy5EEm6YbDg7-MebDtEaiAxtzdkRahdvc30MIKVIOmjlJvI6lXt-AK5T6hM