Significance of important attributes for decision making using C5.0
Published in | 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) pp. 1 - 4 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.07.2017 |
Summary: | Data mining has recently become a heavily explored field, and many researchers have contributed algorithms for studying patterns in large data. Datasets may contain a large number of attributes, and working with all of them can be inefficient in applications such as medical treatment and marketing; hence only the important attributes should be considered. In this paper, we use C5.0 to find the important attributes and apply one classification algorithm and two clustering algorithms to observe the consistency of those attributes. We explore the usefulness of the attributes by measuring accuracy on a breast cancer dataset. Using only the seven important attributes, we recorded accuracies of 85.23%, 93.14% and 94.72% for k-means, EMcluster and Naïve Bayes respectively, which are very close to the accuracies recorded using all 32 attributes. Naïve Bayes performs better with only the important attributes than with all the attributes. |
---|---|
DOI: | 10.1109/ICCCNT.2017.8204031 |
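
The attribute-selection idea in the summary can be illustrated with a minimal sketch. The record does not say how importance is derived from C5.0, so this example substitutes plain information gain, an entropy-based measure related to the splitting criteria of C4.5/C5.0-style decision trees. The function names, the toy rows, and the attribute names (`size`, `texture`, `noise`) are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attr_index):
    """Drop in label entropy after splitting on one categorical attribute."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr_index]].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_k_attributes(rows, labels, names, k):
    """Rank attributes by information gain and keep the k highest."""
    gains = [(information_gain(rows, labels, i), name)
             for i, name in enumerate(names)]
    gains.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in gains[:k]]


# Hypothetical toy data, for illustration only (not the paper's dataset).
names = ["size", "texture", "noise"]
rows = [
    ("large", "rough", "a"),
    ("large", "smooth", "b"),
    ("small", "rough", "a"),
    ("small", "smooth", "b"),
    ("large", "rough", "b"),
    ("small", "smooth", "a"),
]
labels = ["malignant", "malignant", "benign", "benign", "malignant", "benign"]

selected = top_k_attributes(rows, labels, names, k=1)
print(selected)
```

In this toy data `size` separates the two classes perfectly, so it ranks first. In a workflow like the one the summary describes, the reduced attribute set would then be passed to the downstream classifier or clustering algorithm and its accuracy compared against a run on all attributes.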