Significance of important attributes for decision making using C5.0

Bibliographic Details
Published in	2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1-4
Main Authors	Ojha, Uma; Jain, Mahima; Jain, Garima; Tiwari, Ravi Krishna
Format	Conference Proceeding
Language	English
Published	IEEE 01.07.2017
Subjects	Algorithm design and analysis; Breast cancer; C5.0; Classification algorithms; Clustering algorithms; clustering methods; Data mining; important attributes
Summary:	Data mining has lately been among the most explored fields, and many researchers have contributed algorithms for studying patterns in large data. Datasets may contain a large number of attributes, and working with all of them can be inefficient in domains such as medical treatment and marketing; hence only the important attributes should be considered. In this paper, we use C5.0 to find the important attributes and apply one classification algorithm and two clustering algorithms to observe the consistency of those attributes. We assess the usefulness of the attributes by measuring accuracy on a breast cancer dataset. Using only the seven important attributes, we recorded accuracies of 85.23%, 93.14% and 94.72% for k-means, EMcluster and Naïve Bayes respectively, which were very close to the accuracies recorded using all 32 attributes. Naïve Bayes shows better results with the important attributes than with all the attributes.
DOI:	10.1109/ICCCNT.2017.8204031
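
The workflow described in the summary (rank attributes with a decision tree, keep the top seven, then compare one classifier and two clustering methods) could be sketched roughly as follows. This is an illustration under several assumptions, not the authors' implementation: scikit-learn does not provide C5.0, so an entropy-based `DecisionTreeClassifier` stands in for it; the bundled Wisconsin diagnostic breast cancer data (30 features) is assumed to be comparable to the 32-attribute dataset in the record; `GaussianMixture` stands in for EMcluster; and the helper names `top_k_attributes`, `cluster_accuracy` and `evaluate` are hypothetical. The numbers it prints are not expected to match the accuracies reported above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def top_k_attributes(X, y, k=7, seed=0):
    """Rank attributes with an entropy-based tree (a stand-in for C5.0)."""
    tree = DecisionTreeClassifier(criterion="entropy", random_state=seed)
    tree.fit(X, y)
    # Indices of the k attributes with the largest importance scores.
    return np.argsort(tree.feature_importances_)[::-1][:k]


def cluster_accuracy(y_true, labels):
    """Accuracy of a two-cluster labelling under the better of the two
    possible cluster-to-class assignments."""
    agree = float(np.mean(labels == y_true))
    return max(agree, 1.0 - agree)


def evaluate(X, y, seed=0):
    """Score k-means, EM (Gaussian mixture) and Naive Bayes on X."""
    Xs = StandardScaler().fit_transform(X)  # scaling matters for clustering
    km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(Xs)
    em = GaussianMixture(n_components=2, random_state=seed).fit_predict(Xs)
    nb = cross_val_score(GaussianNB(), X, y, cv=10).mean()
    return {
        "k-means": cluster_accuracy(y, km),
        "EM": cluster_accuracy(y, em),
        "Naive Bayes": float(nb),
    }


if __name__ == "__main__":
    data = load_breast_cancer()
    X, y = data.data, data.target
    selected = top_k_attributes(X, y, k=7)
    print("selected:", [data.feature_names[i] for i in selected])
    print("all attributes:  ", evaluate(X, y))
    print("top 7 attributes:", evaluate(X[:, selected], y))
```

Comparing the two printed dictionaries gives the same kind of "important attributes versus all attributes" contrast that the summary reports, though with a different tree learner and dataset encoding.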