Single Nucleotide Polymorphism (SNP)-Strings: An Alternative Method for Assessing Genetic Associations

Bibliographic Details
Published in	PloS one Vol. 9; no. 4; p. e90034
Main Authors	Goodin, Douglas S., Khankhanian, Pouya
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 01.04.2014
Subjects	Algorithms; Alzheimer's disease; Analysis; Biology and Life Sciences; Chromosome Mapping; Deoxyribonucleic acid; Diagnosis; DNA; Drb1 protein; Endopeptidase; Endopeptidases; Female; Gene loci; Gene polymorphism; Genetic aspects; Genetic Predisposition to Disease - genetics; Genetics; Genome-Wide Association Study; Genomes; Genotype; Haplotypes; Haplotypes - genetics; Humans; Linkage Disequilibrium; Major histocompatibility complex; Male; Medicine and Health Sciences; Methods; Multiple sclerosis; Multiple Sclerosis - genetics; Neurology; Neuropeptides; Patient outcomes; Polymorphism; Polymorphism, Single Nucleotide - genetics; Population; Probabilistic methods; Prospective Studies; Quality control; Single nucleotide polymorphisms; Single-nucleotide polymorphism; Statistical analysis; Strings; Studies; Valleys; β-Amyloid; United States; United States > US San Francisco California; California
ISSN	1932-6203
DOI	10.1371/journal.pone.0090034

Summary:	Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association. Here we develop a method to identify the two SNP-haplotypes that combine to produce each person's SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled: DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ∼200 kb of DNA. The SNP-information was converted into an ordered set of eleven numbers (subject-vectors) based on whether a person had zero, one, or two copies of a particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered combinations of eleven numbers (0 or 1), each representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself, suggesting that the SNP-string method is more accurate across the entire region. Accurate haplotype identification will enhance the detection of genetic-associations. The SNP-string method provides a simple means to accomplish this and can be extended to cover larger genomic regions, thereby improving a GWAS's power, even for those published previously.
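The relationship between a subject-vector and its two SNP-strings can be illustrated with a minimal Python sketch. It is not the authors' probabilistic procedure: it only enumerates every pair of 0/1 strings whose element-wise sum equals a given subject-vector, and optionally keeps only pairs drawn from a set of strings already accepted as present. The function name `compatible_string_pairs`, the `known_strings` parameter, and the example vectors are all hypothetical, chosen here for illustration.

```python
from itertools import product


def compatible_string_pairs(subject_vector, known_strings=None):
    """List unordered pairs of 0/1 SNP-strings that sum to subject_vector.

    subject_vector: sequence of 0, 1, or 2 (copies of the variant per SNP).
    known_strings:  optional set of tuples; if given, both strings of a pair
                    must belong to it (hypothetical filter, not from the paper).
    """
    if any(g not in (0, 1, 2) for g in subject_vector):
        raise ValueError("subject-vector entries must be 0, 1, or 2")

    # Only heterozygous positions (value 1) are ambiguous: the single copy
    # may sit on either of the two strings.
    het_sites = [i for i, g in enumerate(subject_vector) if g == 1]

    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous positions are forced: 0 -> (0, 0) and 2 -> (1, 1).
        first = [g // 2 for g in subject_vector]
        for site, bit in zip(het_sites, bits):
            first[site] = bit
        second = [g - a for g, a in zip(subject_vector, first)]
        pair = tuple(sorted((tuple(first), tuple(second))))
        if known_strings is None or all(s in known_strings for s in pair):
            pairs.add(pair)
    return sorted(pairs)


# Hypothetical eleven-SNP subject-vector with four heterozygous positions.
example = (0, 1, 2, 1, 0, 0, 2, 1, 0, 0, 1)
candidates = compatible_string_pairs(example)
print(len(candidates), "candidate SNP-string pairs")
```

A subject-vector with no more than one heterozygous position has exactly one compatible pair, so it resolves without any further information. With k heterozygous positions (k ≥ 1) there are 2^(k-1) candidate pairs, which is why some additional criterion, such as the probabilistic resolution the summary mentions, is needed to choose among them.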
Bibliography:	Conceived and designed the experiments: DSG. Performed the experiments: DSG. Analyzed the data: DSG PK. Wrote the paper: DSG PK. Competing Interests: The authors have declared that no competing interests exist.