SNP genotype calling and quality control for multi-batch-based studies
Published in | Genes & genomics Vol. 41; no. 8; pp. 927 - 939 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Singapore; Springer Singapore; 01.08.2019; Springer Nature B.V; 한국유전학회 |
Subjects | |
Summary: | Background
In genetic analyses, the term ‘batch effect’ refers to systematic differences caused by batch heterogeneity. Controlling this unintended effect is the most important step in the quality control (QC) processes that precede analyses. Currently, batch effects are not appropriately controlled by existing statistical methods, and new approaches are required.
Methods
In this report, we propose a new method to detect heterogeneity of probe intensities among batches, and a procedure for genotype calling and QC in the presence of a batch effect. First, we conduct a multivariate analysis of variance (MANOVA) to test for differences in probe intensities among batches. If heterogeneity is detected, subjects are clustered with a K-medoid algorithm applied to the per-batch averages of the probe intensity measurements, and the genotypes of subjects in different clusters are called separately.
Results
The proposed method was used to assess genotyping data of 3619 subjects consisting of 1074 patients with Alzheimer’s disease, 296 with mild cognitive impairment (MCI), and 1153 controls. It improves the accuracy of called genotypes without filtering out large numbers of subjects and SNPs, and is therefore a reasonable approach for controlling batch effects.
Conclusions
We proposed a new strategy that detects batch effects from probe intensity measurements and calls genotypes in the presence of batch effects. Application of the proposed method to real data shows that it offers a balanced approach. Furthermore, the proposed method can be extended to various scenarios with a simple modification. |
---|---|
Bibliography: | https://doi.org/10.1007/s13258-019-00827-5 |
ISSN: | 1976-9571 2092-9293 |
DOI: | 10.1007/s13258-019-00827-5 |
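
The two steps named in the Methods summary (a MANOVA-style test for intensity differences among batches, then K-medoid clustering of per-batch intensity averages) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: the toy data, the use of two intensity channels per subject (labelled A and B), Euclidean distance, `k=2`, the simple alternating update, and the function names `wilks_lambda` and `k_medoids`. Wilks' lambda is used as the MANOVA statistic (values near 0 suggest strong batch differences); no p-value is computed.

```python
import math

def batch_means(batches):
    """Mean (A, B) intensity per batch; `batches` maps name -> list of (a, b)."""
    means = {}
    for name, obs in batches.items():
        n = len(obs)
        means[name] = (sum(a for a, _ in obs) / n, sum(b for _, b in obs) / n)
    return means

def scatter(points, centre):
    """Entries (s_aa, s_ab, s_bb) of the 2x2 scatter matrix about `centre`."""
    saa = sum((a - centre[0]) ** 2 for a, _ in points)
    sbb = sum((b - centre[1]) ** 2 for _, b in points)
    sab = sum((a - centre[0]) * (b - centre[1]) for a, b in points)
    return saa, sab, sbb

def wilks_lambda(batches):
    """Wilks' lambda = det(within-batch scatter) / det(total scatter)."""
    all_obs = [x for obs in batches.values() for x in obs]
    grand = batch_means({"all": all_obs})["all"]
    means = batch_means(batches)
    total = scatter(all_obs, grand)
    within = [0.0, 0.0, 0.0]
    for name, obs in batches.items():
        for i, v in enumerate(scatter(obs, means[name])):
            within[i] += v

    def det(s):
        return s[0] * s[2] - s[1] ** 2

    return det(within) / det(total)

def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; starts from the first k points (an assumption)."""
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[j])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(min(
                members,
                key=lambda c: sum(math.dist(points[c], points[m]) for m in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

# Hypothetical toy data: four batches, two of them with shifted intensities.
batches = {
    "batch1": [(1.0, 0.1), (0.9, 0.2), (1.1, 0.15)],
    "batch2": [(1.05, 0.12), (0.95, 0.18), (1.0, 0.1)],
    "batch3": [(1.6, 0.5), (1.5, 0.6), (1.7, 0.55)],
    "batch4": [(1.55, 0.52), (1.65, 0.58), (1.6, 0.5)],
}
lam = wilks_lambda(batches)
names = sorted(batches)
means = batch_means(batches)
medoids, labels = k_medoids([means[n] for n in names], k=2)
clusters = dict(zip(names, labels))  # batch name -> cluster label
```

In this sketch, batches that share a cluster label would have their genotypes called together, and batches in different clusters would be called separately, which mirrors the procedure the summary describes. A real analysis would more likely rely on an established MANOVA routine and K-medoids implementation rather than hand-rolled code.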