An enhanced approach on handling missing values using bagging k-NN imputation
Published in | 2013 International Conference on Computer Communication and Informatics pp. 1 - 8 |
---|---|
Main Authors | , |
Format | Conference Proceeding Journal Article |
Language | English |
Published | IEEE, 01.01.2013 |
ISBN | 1467329061 9781467329064 |
DOI | 10.1109/ICCCI.2013.6466301 |
Summary: | Handling high-dimensional data sets has attracted great interest from researchers in the database community over the past decades. Today's businesses capture vast amounts of data, including digital documents, web pages, customer databases, hyper-spectral imagery, social networks, gene arrays, proteomics data, neurobiological signals, high-dimensional dynamical systems, sensor networks, financial transactions and traffic statistics, thereby generating massive high-dimensional datasets. DNA microarrays make it possible to identify the expression levels of thousands of genes during a biological process. The difficulty with microarrays is that expression is measured for thousands of genes (features) from only tens or hundreds of samples. Microarray data often contain missing values that may affect subsequent analysis. In this paper, a novel imputation approach that combines k-NN with bagging is proposed to handle missing values. The experimental results show that the proposed method outperforms other methods in terms of the distance and density of clusters. The proposed approach improves on the performance of traditional k-NN imputation by using bagging. |
---|---|
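
The summary names the technique but gives no algorithmic detail, so the following is only a minimal sketch of what k-NN imputation combined with bagging could look like, not the authors' method. It assumes missing entries are marked as `None`, that distance is the root-mean-square difference over columns observed in both rows, and that each bootstrap sample is drawn with replacement from the other rows. The function names and the parameters `k`, `n_bags` and `seed` are hypothetical.

```python
import math
import random


def _distance(a, b):
    """Root-mean-square difference over columns observed in both rows (assumed metric)."""
    shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no column in common, so the rows cannot be compared
    return math.sqrt(sum(shared) / len(shared))


def knn_estimate(row, col, donors, k):
    """Mean of column `col` over the k donors nearest to `row`, or None if no donor is usable."""
    candidates = [(_distance(row, d), d[col]) for d in donors if d[col] is not None]
    candidates = [c for c in candidates if c[0] < math.inf]
    if not candidates:
        return None
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


def bagged_knn_impute(data, k=3, n_bags=25, seed=0):
    """Return a copy of `data` with each None replaced by a bagged k-NN estimate.

    For every missing entry, a k-NN estimate is computed on each of `n_bags`
    bootstrap samples of the other rows, and the estimates are averaged.
    """
    rng = random.Random(seed)
    result = [list(r) for r in data]
    for i, row in enumerate(data):
        others = [data[j] for j in range(len(data)) if j != i]
        for col, value in enumerate(row):
            if value is not None or not others:
                continue
            estimates = []
            for _ in range(n_bags):
                bag = [rng.choice(others) for _ in others]  # sample with replacement
                est = knn_estimate(row, col, bag, k)
                if est is not None:
                    estimates.append(est)
            if estimates:
                result[i][col] = sum(estimates) / len(estimates)
    return result


# Toy expression matrix: rows are samples, columns are genes (made-up values).
toy = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [5.1, 6.1, 7.2],
    [0.9, 1.9, 2.9],
]
filled = bagged_knn_impute(toy, k=1, n_bags=25, seed=0)
```

Averaging over bootstrap samples is meant to make the estimate less sensitive to any single neighbour. In this sketch the neighbours are always taken from the original rows, so previously imputed values are never reused as donors.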