Strategies for processing and quality control of Illumina genotyping arrays

Bibliographic Details
Published in	Briefings in Bioinformatics, Vol. 19, no. 5, pp. 765-775
Main Authors	Zhao, Shilin; Jing, Wang; Samuels, David C; Sheng, Quanghu; Shyr, Yu; Guo, Yan
Format	Journal Article
Language	English
Published	England Oxford University Press 28.09.2018 Oxford Publishing Limited (England)
Subjects	Algorithms; Arrays; Cluster Analysis; Computational Biology - methods; Continental Population Groups - genetics; data collection; Female; Gene Frequency; Genome-wide association studies; genome-wide association study; Genome-Wide Association Study - methods; Genome-Wide Association Study - statistics & numerical data; Genomes; Genotype; Genotyping; Genotyping Techniques - methods; Genotyping Techniques - standards; Genotyping Techniques - statistics & numerical data; High-Throughput Nucleotide Sequencing - methods; High-Throughput Nucleotide Sequencing - standards; Humans; Male; Models, Genetic; Nucleotides; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Quality Control; Single-nucleotide polymorphism; Software; cluster; SNP array; quality control; genotyping array; genotyping
Summary:	Illumina genotyping arrays have powered thousands of large-scale genome-wide association studies over the past decade. Yet, because of the tremendous volume and complicated genetic assumptions of Illumina genotyping data, processing and quality control (QC) of these data remain a challenge. Thorough QC ensures the accurate identification of single-nucleotide polymorphisms and is required for the correct interpretation of genetic association results. By processing genotyping data on >100 000 subjects from >10 major Illumina genotyping arrays, we have accumulated extensive experience in handling some of the most peculiar scenarios related to the processing and QC of Illumina genotyping data. Here, we describe strategies for processing Illumina genotyping data from the raw data to an analysis-ready format, and we elaborate on the QC procedures required at each processing step. High-quality Illumina genotyping data sets can be obtained by following our detailed QC strategies.
ISSN:	1467-5463 1477-4054
DOI:	10.1093/bib/bbx012