Towards better accuracy for missing value estimation of epistatic miniarray profiling data by a novel ensemble approach

Bibliographic Details
Published in: Genomics (San Diego, Calif.), Vol. 97, no. 5, pp. 257-264
Main Authors: Pan, Xiao-Yong; Tian, Ye; Huang, Yan; Shen, Hong-Bin
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier Inc, 01.05.2011
Summary: Epistatic miniarray profiling (E-MAP) is a powerful tool for analyzing gene functions and their biological relevance. However, E-MAP data suffer from a large proportion of missing values, which often leads to misleading and biased analysis results. Effective missing value estimation methods for E-MAP are therefore urgently needed. Although several independent algorithms can be applied to this problem, their performance varies significantly across datasets, indicating that each algorithm has its own advantages and disadvantages. In this paper, we propose EMDI, a novel ensemble approach based on high-level diversity that imputes missing values by combining two global and four local base estimators. Experimental results on five E-MAP datasets show that EMDI outperforms every single base algorithm, demonstrating that an appropriate combination exploits the complementarity among different methods. Comparisons between several fusion strategies also show that the proposed high-level diversity scheme is superior to the others. EMDI is freely available at www.csbio.sjtu.edu.cn/bioinf/EMDI/.
ISSN: 0888-7543, 1089-8646
DOI: 10.1016/j.ygeno.2011.03.001
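
The summary describes combining global and local base estimators into one ensemble imputer. The sketch below illustrates only that general idea and is not the EMDI algorithm: the two base estimators (row-mean and a simple nearest-neighbour fill), the equal-weight averaging, and all function names are assumptions made for illustration. The record does not specify EMDI's actual base estimators or its high-level diversity fusion scheme.

```python
import numpy as np


def row_mean_impute(x):
    """Illustrative 'global-style' baseline: fill each gap with its row mean."""
    filled = x.copy()
    row_means = np.nanmean(x, axis=1)
    rows, cols = np.where(np.isnan(x))
    filled[rows, cols] = row_means[rows]
    return filled


def knn_impute(x, k=2):
    """Illustrative 'local-style' baseline: fill each gap from the k most similar rows."""
    seed = row_mean_impute(x)  # used to compute distances and as a fallback value
    filled = x.copy()
    for i, j in zip(*np.where(np.isnan(x))):
        # Only rows that actually observed column j can donate a value.
        candidates = [r for r in range(x.shape[0])
                      if r != i and not np.isnan(x[r, j])]
        if not candidates:
            filled[i, j] = seed[i, j]
            continue
        dists = [np.linalg.norm(seed[i] - seed[r]) for r in candidates]
        nearest = [candidates[n] for n in np.argsort(dists)[:k]]
        filled[i, j] = np.mean([x[r, j] for r in nearest])
    return filled


def ensemble_impute(x, estimators, weights=None):
    """Average the base imputations (equal weights by default; an assumption)."""
    stacked = np.stack([estimate(x) for estimate in estimators])
    combined = np.average(stacked, axis=0, weights=weights)
    # Observed entries are kept as-is; only missing entries take the combined value.
    return np.where(np.isnan(x), combined, x)


if __name__ == "__main__":
    # Toy score matrix with two missing interactions (hypothetical data).
    scores = np.array([[1.0, 2.0, np.nan],
                       [1.0, 2.0, 3.0],
                       [5.0, np.nan, 7.0]])
    print(ensemble_impute(scores, [row_mean_impute, knn_impute]))
```

Passing `weights` to `ensemble_impute` would let better-performing estimators count for more, which is one simple way a fusion strategy could differ from plain averaging.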