An improved preprocessing algorithm for haplotype inference by pure parsimony

Bibliographic Details
Published in Journal of bioinformatics and computational biology Vol. 12; no. 4; p. 1450020
Main Authors Choi, Mun-Ho, Kang, Seung-Ho, Lim, Hyeong-Seok
Format Journal Article
Language English
Published Singapore 01.08.2014
Summary: The identification of haplotypes, which encode SNPs in a single chromosome, makes it possible to perform a haplotype-based association test with disease. Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference (HI). We propose an improved preprocessing method for solving haplotype inference by pure parsimony (HIPP), which excludes a large number of redundant haplotypes by detecting some groups of haplotypes that are dispensable for optimal solutions. The method uses only inclusion relations between groups of haplotypes but dramatically reduces the number of candidate haplotypes; it therefore reduces the computation time and memory use of real HIPP solvers. The proposed method can be easily coupled with a wide range of optimization methods that consider a set of candidate haplotypes explicitly. On simulated and well-known benchmark datasets, the experimental results show that our method coupled with a classical exact HIPP solver runs much faster than the state-of-the-art solver and can solve a large number of instances that could not previously be solved in a reasonable time.
ISSN: 1757-6334
DOI: 10.1142/S0219720014500206
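
To make the terms in the summary concrete, the following minimal sketch shows what it means for a pair of haplotypes to explain a genotype, how a set of candidate haplotypes arises, and what a pure parsimony solution is. It is not the preprocessing method of the paper, whose details are not given in this record. The encoding (0 or 1 for a homozygous site, 2 for a heterozygous site), the function names, and the toy genotypes are assumptions made for illustration, and the exhaustive search is only practical for very small inputs.

```python
from itertools import combinations, product


def explaining_pairs(genotype):
    """Return every unordered haplotype pair that explains a genotype.

    Assumed encoding: 0 or 1 marks a homozygous site (both haplotypes
    carry that allele), 2 marks a heterozygous site (the two haplotypes
    carry different alleles).
    """
    het_sites = [i for i, site in enumerate(genotype) if site == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for i, b in zip(het_sites, bits):
            h1[i] = b
            h2[i] = 1 - b
        # A frozenset makes the pair unordered and collapses h1 == h2.
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


def candidate_haplotypes(genotypes):
    """Collect every haplotype that appears in some explaining pair."""
    return {h for g in genotypes for pair in explaining_pairs(g) for h in pair}


def is_explained(genotype, chosen):
    """True if some explaining pair of the genotype lies inside `chosen`."""
    return any(pair <= chosen for pair in explaining_pairs(genotype))


def brute_force_hipp(genotypes):
    """Find a smallest haplotype set explaining all genotypes (exhaustive)."""
    candidates = sorted(candidate_haplotypes(genotypes))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            chosen = frozenset(subset)
            if all(is_explained(g, chosen) for g in genotypes):
                return set(subset)
    return set()


if __name__ == "__main__":
    toy_genotypes = [(2, 2, 0), (0, 1, 0), (1, 0, 0)]  # hypothetical data
    print(len(candidate_haplotypes(toy_genotypes)), "candidate haplotypes")
    print(brute_force_hipp(toy_genotypes))
```

Each additional heterozygous site doubles the number of haplotypes that can explain a genotype, so the candidate set grows exponentially with the number of such sites. This is why pruning candidates before handing them to a solver, as the summary describes, can matter so much for running time and memory.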