SNP genotype calling and quality control for multi-batch-based studies


Bibliographic Details
Published in Genes & genomics Vol. 41; no. 8; pp. 927-939
Main Authors Seo, Sujin, Park, Kyungtaek, Lee, Jang Jae, Choi, Kyu Yeong, Lee, Kun Ho, Won, Sungho
Format Journal Article
Language English
Published Singapore Springer Singapore 01.08.2019
Springer Nature B.V
한국유전학회
Summary:
Background: In genetic analyses, the term ‘batch effect’ refers to systematic differences caused by batch heterogeneity. Controlling this unintended effect is the most important step in the quality control (QC) processes that precede analyses. Currently, batch effects are not adequately controlled by existing statistical methods, and new approaches are required.
Methods: In this report, we propose a new method to detect heterogeneity of probe intensities among different batches, together with a procedure for genotype calling and QC in the presence of a batch effect. First, a multivariate analysis of variance (MANOVA) is conducted to test for differences in probe intensities among batches. If heterogeneity is detected, subjects are clustered with a K-medoids algorithm based on the per-batch averages of the probe intensity measurements, and the genotypes of subjects in different clusters are called separately.
Results: The proposed method was used to assess genotyping data of 3619 subjects consisting of 1074 patients with Alzheimer’s disease, 296 with mild cognitive impairment (MCI), and 1153 controls. The proposed method improves the accuracy of called genotypes without filtering out large numbers of subjects and SNPs, and is therefore a reasonable approach for controlling batch effects.
Conclusions: We proposed a new strategy that detects batch effects from probe intensity measurements and calls genotypes in the presence of batch effects. Its application to real data shows that it offers a balanced approach. Furthermore, the proposed method can be extended to various scenarios with simple modifications.
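The two steps named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: each observation is treated as a two-dimensional probe intensity pair, the toy values and batch names are invented, the helper names (`mean_vec`, `scatter`, `wilks_lambda`, `k_medoids`) are hypothetical, and the clustering is read as grouping batches by their mean intensities. The heterogeneity check computes only Wilks' lambda, one common MANOVA statistic, without a p-value, and the K-medoids step uses an exhaustive search that is only practical for a handful of batches.

```python
from itertools import combinations
from math import dist


def mean_vec(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def scatter(points, center):
    """2x2 sums of squares and cross-products around center, as (sxx, sxy, syy)."""
    sxx = sum((x - center[0]) ** 2 for x, _ in points)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in points)
    syy = sum((y - center[1]) ** 2 for _, y in points)
    return sxx, sxy, syy


def det2(m):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    return m[0] * m[2] - m[1] ** 2


def wilks_lambda(batches):
    """Wilks' lambda = det(W) / det(T); values near 0 suggest batch heterogeneity."""
    all_points = [p for pts in batches.values() for p in pts]
    total = scatter(all_points, mean_vec(all_points))
    within = (0.0, 0.0, 0.0)
    for pts in batches.values():
        s = scatter(pts, mean_vec(pts))
        within = tuple(a + b for a, b in zip(within, s))
    return det2(within) / det2(total)


def k_medoids(points, k):
    """Exhaustive K-medoids: try every set of k medoids, keep the lowest total distance."""
    names = sorted(points)

    def cost(medoids):
        return sum(min(dist(points[n], points[m]) for m in medoids) for n in names)

    best = min(combinations(names, k), key=cost)
    labels = {n: min(best, key=lambda m: dist(points[n], points[m])) for n in names}
    return best, labels


# Invented toy intensities: batch3 is shifted relative to batch1 and batch2.
batches = {
    "batch1": [(1.0, 0.2), (1.1, 0.1), (0.9, 0.3), (1.2, 0.2)],
    "batch2": [(1.0, 0.3), (1.1, 0.2), (0.8, 0.2), (1.2, 0.1)],
    "batch3": [(2.0, 1.1), (2.1, 1.3), (1.9, 1.2), (2.2, 1.0)],
}

lam = wilks_lambda(batches)
batch_means = {name: mean_vec(pts) for name, pts in batches.items()}
medoids, labels = k_medoids(batch_means, k=2)

print("Wilks' lambda:", round(lam, 4))
print("Cluster assignment per batch:", labels)
```

In this toy setting, batches that share a medoid would have their genotypes called together, and the shifted batch would be called on its own. A real analysis would need a proper significance test for the MANOVA statistic and a scalable K-medoids routine.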
https://doi.org/10.1007/s13258-019-00827-5
ISSN:1976-9571
2092-9293
DOI:10.1007/s13258-019-00827-5