K-modes Algorithm Based on Rough Set and Information Entropy


Bibliographic Details
Published in Journal of physics. Conference series Vol. 1754; no. 1; pp. 12239 - 12244
Main Authors Xingyu, Gong; Ke, Cao; Pengtao, Jia; Shangfu, Gong
Format Journal Article
Language English
Published Bristol IOP Publishing 01.02.2021
Summary: The traditional K-modes algorithm is susceptible to interference from redundant attributes and uses only 0-1 matching to define the distance between the attribute values of two objects, without fully considering the influence of each categorical attribute on the clustering result. To overcome these shortcomings, this paper proposes an improved K-modes clustering algorithm based on rough sets and information entropy. To handle the large number of redundant attributes in clustering data, the paper first uses a rough-set attribute reduction algorithm to eliminate redundant attributes and determine the importance of each attribute, then combines this with information gain to determine the weight of each attribute. Finally, it tests the traditional and improved algorithms on five data sets from the UCI Machine Learning Repository, including Soybean-Small and Zoo. The experimental results show that the improved algorithm achieves higher clustering efficiency and accuracy than the traditional algorithm.
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1754/1/012239
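
The summary contrasts plain 0-1 matching with a distance in which each attribute carries its own weight. The following is a minimal sketch of that contrast, not the authors' implementation: the function names and the example weights are hypothetical, and the summary does not give the formula by which rough-set importance and information gain are combined into weights, so the weights are simply passed in. The `entropy` helper is the standard Shannon entropy of one attribute column, included only as a building block that such a weighting might use.

```python
from collections import Counter
from math import log2


def matching_distance(x, y):
    """0-1 (simple matching) distance: count of attributes where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def weighted_matching_distance(x, y, weights):
    """Weighted variant: each mismatching attribute contributes its weight.

    The weights are an assumption of this sketch; the paper derives them
    from rough-set attribute importance and information gain.
    """
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def mode_of(objects):
    """Cluster mode: the most frequent value of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


def entropy(values):
    """Shannon entropy (in bits) of the values in one attribute column."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


if __name__ == "__main__":
    x = ("red", "small", "round")
    y = ("red", "large", "round")
    hypothetical_weights = (0.5, 0.3, 0.2)  # illustrative values only

    print(matching_distance(x, y))
    print(weighted_matching_distance(x, y, hypothetical_weights))
    print(mode_of([("a", "x"), ("a", "y"), ("b", "y")]))
    print(entropy(["a", "a", "b", "b"]))
```

With unit weights the weighted distance reduces to the 0-1 matching distance, so the traditional algorithm is the special case in which every attribute is treated as equally important.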