Significance of important attributes for decision making using C5.0


Bibliographic Details
Published in 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1-4
Main Authors Ojha, Uma; Jain, Mahima; Jain, Garima; Tiwari, Ravi Krishna
Format Conference Proceeding
Language English
Published IEEE 01.07.2017
Summary: Data mining has lately been the most explored field. Many researchers have contributed different algorithms to study patterns in large data. Datasets may contain a large number of attributes, and working with all of them can be inefficient for tasks such as medical treatment and marketing; hence only the important attributes should be considered. In this paper, we use C5.0 to find the important attributes and apply one classification algorithm and two clustering algorithms to observe the consistency of the important attributes. We explore the usefulness of the attributes by measuring accuracy on a breast cancer dataset. Using only the important attributes (seven in total), we recorded accuracies of 85.23%, 93.14%, and 94.72% for k-means, EMcluster, and Naïve Bayes respectively, which were very close to the accuracies recorded using all 32 attributes. Naïve Bayes shows better results with the important attributes than with all the attributes.
DOI:10.1109/ICCCNT.2017.8204031
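
Illustrative note: the summary describes ranking attributes by importance before classifying or clustering. The sketch below is not C5.0 and not the authors' code. It only illustrates the general idea with gain ratio, the entropy-based split criterion used by C4.5, the predecessor of C5.0. The function names, attribute names, and the six toy records are all hypothetical and are not taken from the paper's breast cancer data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Information gain from splitting on `attr`, divided by the split entropy."""
    total = len(rows)
    # Group the class labels by the value each record has for `attr`.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Entropy of the attribute's own value distribution.
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels):
    """Return (attribute, gain ratio) pairs, most informative first."""
    scores = {a: gain_ratio(rows, labels, a) for a in rows[0]}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records (assumption: discretized attribute values).
rows = [
    {"radius": "large", "texture": "rough", "id_parity": "odd"},
    {"radius": "large", "texture": "smooth", "id_parity": "even"},
    {"radius": "small", "texture": "rough", "id_parity": "odd"},
    {"radius": "small", "texture": "smooth", "id_parity": "even"},
    {"radius": "large", "texture": "rough", "id_parity": "even"},
    {"radius": "small", "texture": "smooth", "id_parity": "odd"},
]
labels = ["malignant", "malignant", "benign", "benign", "malignant", "benign"]

ranking = rank_attributes(rows, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the hypothetical `radius` attribute separates the two classes perfectly, so it ranks first. A reduced attribute set would then keep only the top-ranked attributes before running a classifier or clusterer.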