Fast detection of high-order epistatic interactions in genome-wide association studies using information theoretic measure

Bibliographic Details
Published in: Computational Biology and Chemistry, Vol. 50, pp. 19-28
Main Authors: Leem, Sangseob; Jeong, Hyun-hwan; Lee, Jungseob; Wee, Kyubum; Sohn, Kyung-Ah
Format: Journal Article
Language: English
Published: England, Elsevier Ltd, 01.06.2014
Summary:
• We propose a fast method for detecting high-order epistatic interactions in GWAS.
• We employ mutual information as the association measure and for SNP clustering.
• Our algorithm completes a high-order GWAS analysis in a matter of hours on a PC.
• We report up to 5-way interactions for each of the seven diseases in the WTCCC data.

There are many algorithms for detecting epistatic interactions in GWAS. However, most of them are applicable only to two-locus interactions. Some are designed from the outset to detect only two-locus interactions. Others place no limit on the order of interactions but in practice take a very long time to detect higher-order interactions in real GWAS data. Even the better ones take days to detect higher-order interactions in the WTCCC data. We propose a fast algorithm for detecting high-order epistatic interactions in GWAS. It first runs the k-means clustering algorithm on the set of all SNPs. Candidates are then selected from each cluster and examined to find the causative SNPs of k-locus interactions. We use mutual information from information theory as the measure of association between genotypes and phenotypes. We tested the power and speed of our method on extensive sets of simulated data. The results show that our method has equal or greater power and runs much faster than previously reported methods. We also applied our algorithm to each of the seven diseases in the WTCCC data to analyze up to 5-locus interactions. It takes only a few hours to analyze 5-locus interactions in one dataset. From the results we make some interesting and meaningful observations on each disease in the WTCCC data. In this study, a simple yet powerful two-step approach is proposed for fast detection of high-order epistatic interactions. Our algorithm makes it possible to detect high-order epistatic interactions in GWAS in a matter of hours on a PC.
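The two-step idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the clusters are passed in as plain lists of SNP indices (standing in for the output of the k-means step, which is not reproduced here), candidates within a cluster are ranked by their single-SNP mutual information with the phenotype, and one candidate is drawn from each cluster to form a k-locus combination. The function names `mutual_information`, `snp_set_mi`, and `search_interactions` are hypothetical, and the toy data uses binary genotypes for brevity.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(genotype_rows, phenotypes):
    """Return I(G; Y) in bits, where G is the joint genotype of a SNP set."""
    n = len(phenotypes)
    joint = Counter(zip(genotype_rows, phenotypes))
    g_counts = Counter(genotype_rows)
    y_counts = Counter(phenotypes)
    mi = 0.0
    for (g, y), c in joint.items():
        # p(g, y) * log2( p(g, y) / (p(g) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (g_counts[g] * y_counts[y]))
    return mi


def snp_set_mi(data, snp_indices, phenotypes):
    """Mutual information between the joint genotype at snp_indices and the phenotype."""
    rows = [tuple(sample[i] for i in snp_indices) for sample in data]
    return mutual_information(rows, phenotypes)


def search_interactions(data, phenotypes, clusters, per_cluster=2):
    """Pick candidates per cluster, then score every cross-cluster combination.

    Assumption: candidates are the per_cluster SNPs with the highest
    single-SNP mutual information in each cluster.
    """
    candidates = []
    for cluster in clusters:
        ranked = sorted(
            cluster,
            key=lambda i: snp_set_mi(data, (i,), phenotypes),
            reverse=True,
        )
        candidates.append(ranked[:per_cluster])
    # Exhaustive search is cheap here because only a few candidates remain.
    best = max(
        product(*candidates),
        key=lambda combo: snp_set_mi(data, combo, phenotypes),
    )
    return best, snp_set_mi(data, best, phenotypes)


# Toy data: 4 binary SNPs, all 16 genotype combinations, one sample each.
# The phenotype is 1 only when SNP 0 and SNP 1 are both 1.
data = [list(bits) for bits in product((0, 1), repeat=4)]
phenotypes = [s[0] & s[1] for s in data]
clusters = [[0, 2], [1, 3]]  # hypothetical cluster assignment

best, score = search_interactions(data, phenotypes, clusters)
print(best, round(score, 4))  # (0, 1) 0.8113
```

On this toy input the pair (0, 1) scores highest because its joint genotype fully determines the phenotype, so the mutual information equals the phenotype entropy. Restricting the exhaustive search to a few candidates per cluster is what keeps the number of k-locus combinations small in this sketch.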
ISSN: 1476-9271, 1476-928X
DOI: 10.1016/j.compbiolchem.2014.01.005