Leveraging known genomic variants to improve detection of variants, especially close-by Indels
Published in | Bioinformatics Vol. 34; no. 17; pp. 2918 - 2926 |
---|---|
Main Authors | Vo, Nam S; Phan, Vinhthuy |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 01.09.2018 |
ISSN | 1367-4803 1367-4811 1460-2059 |
DOI | 10.1093/bioinformatics/bty183 |
Abstract | Abstract
Motivation
The detection of genomic variants has great significance in genomics, bioinformatics, biomedical research and its applications. However, despite considerable effort, Indels and structural variants are still under-characterized compared to SNPs. Current approaches based on next-generation sequencing data usually require large numbers of reads (high coverage) to detect such variants accurately. Even so, Indels, especially those close to each other, remain hard to detect accurately.
Results
We introduce a novel approach that leverages known variant information, e.g. provided by dbSNP, dbVar, ExAC or the 1000 Genomes Project, to improve the sensitivity of detecting variants, especially close-by Indels. In our approach, the standard reference genome and the known variants are combined to build a meta-reference, which is expected to be probabilistically closer to the subject genomes than the standard reference. An alignment algorithm that takes known variant information into account is developed to accurately align reads to the meta-reference. This strategy resulted in accurate alignment and variant calling even with low coverage data. We showed that, compared to popular methods such as GATK and SAMtools, our method significantly improves the sensitivity of detecting variants, especially Indels that are close to each other. In particular, our method was able to call these close-by Indels with 15-20% higher sensitivity than other methods at low coverage, and still achieved 1-5% higher sensitivity at high coverage, at competitive precision. These results were validated using simulated data with variant profiles extracted from the 1000 Genomes Project data, and real data from the Illumina Platinum Genomes Project and ExAC database. Our findings suggest that by incorporating known variant information in an appropriate manner, sensitive variant calling is possible at a low cost.
Availability and implementation
Implementation can be found in our public code repository https://github.com/namsyvo/IVC.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
Author | Vo, Nam S; Phan, Vinhthuy |
Author affiliations | 1. Vo, Nam S (vosynam@gmail.com), Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; 2. Phan, Vinhthuy, Department of Computer Science, The University of Memphis, Memphis, TN, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/29590294$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_1590_1678_4685_gmb_2020_0047 |
ContentType | Journal Article |
Copyright | The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com |
Discipline | Biology |
EISSN | 1460-2059 1367-4811 |
EndPage | 2926 |
ExternalDocumentID | 29590294 10_1093_bioinformatics_bty183 10.1093/bioinformatics/bty183 |
Genre | Research Support, U.S. Gov't, Non-P.H.S Journal Article |
GrantInformation_xml | – fundername: National Science Foundation Computing and Communication Foundations grantid: NSF CCF-1320297 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 17 |
Language | English |
License | This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) |
LinkModel | DirectLink |
OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/34/17/2918/25702995/bty183.pdf |
PMID | 29590294 |
PageCount | 9 |
PublicationDate | 2018-09-01 |
PublicationPlace | England |
PublicationTitle | Bioinformatics |
PublicationYear | 2018 |
Publisher | Oxford University Press |
StartPage | 2918 |
Title | Leveraging known genomic variants to improve detection of variants, especially close-by Indels |
URI | https://www.ncbi.nlm.nih.gov/pubmed/29590294 https://www.proquest.com/docview/2019812500 |
Volume | 34 |
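
The meta-reference idea described in the abstract can be illustrated with a toy example. The sketch below is not the IVC implementation: the function names `build_meta_reference` and `min_mismatches`, the variant encoding (each known allele replaces exactly one reference base), and the mismatch-count scoring are all assumptions made for illustration, and allele frequencies are left out entirely. It only shows why a read that carries two nearby known Indels can align cleanly to a meta-reference while accumulating many mismatches against the plain reference.

```python
from functools import lru_cache


def build_meta_reference(reference, known_variants):
    """Combine a reference string with known variants (hypothetical encoding).

    known_variants maps a 0-based position to a list of alternative alleles.
    Each allele replaces the single reference base at that position:
    "" models a deletion, "GTT" at a G models an insertion of TT after G.
    """
    meta = []
    for pos, base in enumerate(reference):
        # The reference base is always allowed; known alternatives are added.
        meta.append((base,) + tuple(known_variants.get(pos, ())))
    return meta


def min_mismatches(read, meta, start):
    """Fewest mismatches when aligning `read` at `start`, over all allele choices."""

    @lru_cache(maxsize=None)
    def go(i, j):
        if j >= len(read):       # whole read consumed
            return 0
        if i >= len(meta):       # ran off the reference: count leftover bases
            return len(read) - j
        best = float("inf")
        for allele in meta[i]:
            chunk = read[j:j + len(allele)]
            cost = sum(a != b for a, b in zip(allele, chunk))
            best = min(best, cost + go(i + 1, j + len(allele)))
        return best

    return go(start, 0)


reference = "ACGTACGT"
# Two close-by known Indels: insertion of TT after position 2, deletion at position 5.
meta = build_meta_reference(reference, {2: ["GTT"], 5: [""]})
plain = build_meta_reference(reference, {})

read = "CGTTTAGT"  # sampled from a genome carrying both Indels
print(min_mismatches(read, meta, 1))   # against the meta-reference
print(min_mismatches(read, plain, 1))  # against the standard reference
```

A real aligner would also allow novel gaps and would weight alleles by their population frequency; this sketch deliberately does neither, so the comparison isolates the effect of knowing the variants in advance.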