Comprehensive analysis to improve the validation rate for single nucleotide variants detected by next-generation sequencing

Bibliographic Details
Published in PLoS ONE Vol. 9; no. 1; p. e86664
Main Authors Park, Mi-Hyun, Rhee, Hwanseok, Park, Jung Hoon, Woo, Hae-Mi, Choi, Byung-Ok, Kim, Bo-Young, Chung, Ki Wha, Cho, Yoo-Bok, Kim, Hyung Jin, Jung, Ji-Won, Koo, Soo Kyung
Format Journal Article
Language English
Published United States Public Library of Science 29.01.2014
Public Library of Science (PLoS)
Summary: Next-generation sequencing (NGS) has enabled the high-throughput discovery of germline and somatic mutations. However, NGS-based variant detection is still prone to errors, resulting in inaccurate variant calls. Here, we categorized the variants detected by NGS according to total read depth (TD) and SNP quality (SNPQ), and performed Sanger sequencing with 348 selected non-synonymous single nucleotide variants (SNVs) for validation. Using the SAMtools and GATK algorithms, the validation rate was positively correlated with SNPQ but showed no correlation with TD. In addition, common variants called by both programs had a higher validation rate than caller-specific variants. We further examined several parameters to improve the validation rate, and found that strand bias (SB) was a key parameter. SB in NGS data showed a strong difference between the variants passing validation and those that failed validation, showing a validation rate of more than 92% (filtering cutoff value: alternate allele forward [AF] ≥ 20 and AF < 80 in SAMtools, SB < -10 in GATK). Moreover, the validation rate increased significantly (up to 97-99%) when the variant was filtered together with the suggested values of mapping quality (MQ), SNPQ and SB. This detailed and systematic study provides comprehensive recommendations for improving validation rates, saving time and lowering cost in NGS analyses.
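The strand-bias cutoffs quoted in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that AF is the percentage of alternate-allele reads on the forward strand (the summary does not define the unit) and that the GATK SB value is already available as a number. The `Variant` record and the function names are hypothetical. The MQ and SNPQ cutoffs are not stated in the summary, so they are left out rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical record holding only the fields this sketch needs."""
    caller: str                      # "samtools" or "gatk"
    alt_forward: int = 0             # alternate-allele reads on the forward strand
    alt_reverse: int = 0             # alternate-allele reads on the reverse strand
    gatk_sb: Optional[float] = None  # strand-bias score, as reported by GATK


def alt_forward_percent(v: Variant) -> Optional[float]:
    """Assumed meaning of AF: percent of alternate-allele reads on the forward strand."""
    total = v.alt_forward + v.alt_reverse
    if total == 0:
        return None
    return 100.0 * v.alt_forward / total


def passes_strand_bias(v: Variant) -> bool:
    """Apply the cutoffs quoted in the summary: 20 <= AF < 80 (SAMtools), SB < -10 (GATK)."""
    if v.caller == "samtools":
        af = alt_forward_percent(v)
        return af is not None and 20 <= af < 80
    if v.caller == "gatk":
        return v.gatk_sb is not None and v.gatk_sb < -10
    raise ValueError(f"unknown caller: {v.caller}")


def validation_rate(confirmed: int, tested: int) -> float:
    """Percent of tested variants confirmed by Sanger sequencing."""
    if tested == 0:
        raise ValueError("no variants tested")
    return 100.0 * confirmed / tested
```

For example, a SAMtools call with 10 forward and 10 reverse alternate-allele reads (AF = 50 under the assumption above) passes, while one with 1 forward and 9 reverse reads (AF = 10) is filtered out.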
Competing Interests: The authors have the following interests. Hwanseok Rhee and Jung Hoon Park are employed by Macrogen Inc., a company that markets NGS services. There are no further patents, products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials, as detailed online in the guide for authors.
Conceived and designed the experiments: M-HP HR JHP SKK. Performed the experiments: M-HP H-MW B-YK. Analyzed the data: M-HP HR JHP H-MW B-YK Y-BC HJK J-WJ SKK. Contributed reagents/materials/analysis tools: B-OC KWC. Wrote the paper: M-HP JHP SKK.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0086664