A novel index measure imputation algorithm for missing data values: A machine learning approach

Bibliographic Details
Published in: 2012 IEEE International Conference on Computational Intelligence and Computing Research, pp. 1-7
Main Authors: Madhu, G., Rajinikanth, T. V.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2012
ISBN: 1467313424, 9781467313421
DOI: 10.1109/ICCIC.2012.6510198

Summary: The problem of missing data in real-world datasets plays a significant role in the real-time data mining process and becomes more complex in large databases. The presence of missing values influences dataset features and class attributes, thus affecting the predictive accuracy of classifiers. Over the last decade, many researchers have proposed different techniques for dealing with missing attribute values in databases with homogeneous and/or numeric attributes. In this research work, we propose a new indexing measure for the imputation of missing attribute values, which computes the similarity between any two typical elements in the dataset. It can be applied to any dataset, whether nominal and/or real-valued. The proposed algorithm is evaluated through extensive experiments and compared with the KNNI, SVMI, WKNNI, KMI and FKMI algorithms. The results show that the proposed algorithm performs better than the existing imputation algorithms in terms of classification accuracy, and that our decision tree algorithm employs highly accurate decision rules.
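For orientation, the general idea behind similarity-based imputation can be illustrated with a minimal sketch of plain k-nearest-neighbour imputation (KNNI), one of the baselines named in the summary. This is not the proposed index measure, which the summary does not specify. The function names `distance` and `knn_impute` are hypothetical, `None` is assumed to mark a missing value, and numeric attributes are assumed to be pre-scaled to [0, 1] so that they are comparable with the 0/1 mismatch used for nominal attributes.

```python
from collections import Counter


def distance(a, b):
    """Mean per-attribute distance over attributes observed in both rows.

    Numeric pairs contribute |x - y| (assumed pre-scaled to [0, 1]);
    nominal pairs contribute 0 on a match and 1 on a mismatch.
    """
    total, used = 0.0, 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip attributes missing in either row
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += abs(x - y)
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    # Rows with no attribute in common are treated as infinitely far apart.
    return total / used if used else float("inf")


def knn_impute(rows, k=2):
    """Return a copy of rows with each missing cell filled from its k nearest donors."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows that have attribute j observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if not nearest:
                continue  # no donor available; leave the cell missing
            if all(isinstance(v, (int, float)) for v in nearest):
                imputed[i][j] = sum(nearest) / len(nearest)  # numeric: mean
            else:
                imputed[i][j] = Counter(nearest).most_common(1)[0][0]  # nominal: mode
    return imputed
```

Calling `knn_impute(rows, k=1)` on a list of equal-length rows fills each `None` with the value from the single most similar row; larger `k` averages numeric donors and takes the most frequent value among nominal donors.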