Entropy-Based Feature Selection for Data Clustering Using k-Means and k-Medoids Algorithms
Published in | 2020 Fifth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 36-40 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 26.11.2020 |
Summary: | Clustering splits a large dataset into smaller subsets, each called a cluster. Objects within a cluster share similar characteristics, and each cluster differs from all other clusters. The most common clustering algorithms are k-Means and k-Medoids. Clustering a high-dimensional dataset can be difficult. To overcome this problem, the dimension of the dataset is reduced. In the present work, we reduce the dimension of a dataset by selecting a suitable subset of features using an entropy-based method. We calculate entropy using both Euclidean and Manhattan distances. We experiment with three widely used datasets from the Machine Learning Repository of the University of California, Irvine (UCI). From the experimental results, we conclude that our approach produces higher clustering accuracies than those of previous works. |
---|---|
DOI: | 10.1109/ICRCICN50933.2020.9296186 |
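
The summary does not state the entropy formula, so the following is only a minimal sketch of how a distance-based entropy score might be computed under either metric. It assumes a similarity of the form S = exp(-alpha * D), with alpha chosen so that S = 0.5 at the mean pairwise distance, which is one common formulation of distance-based entropy and may differ from what the paper actually uses. The function names (`pairwise_distances`, `distance_entropy`, `leave_one_out_entropy`) are hypothetical and chosen for illustration.

```python
import numpy as np

def pairwise_distances(X, metric="euclidean"):
    """Return the full n x n distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]  # O(n^2 * d) memory; fine for small datasets
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=-1)
    raise ValueError(f"unknown metric: {metric}")

def distance_entropy(X, metric="euclidean"):
    """Entropy (in bits) summed over all unordered pairs of rows.

    ASSUMPTION: similarity S = exp(-alpha * D), with alpha set so that
    S = 0.5 at the mean pairwise distance. The paper may define it differently.
    """
    D = pairwise_distances(X, metric)
    d = D[np.triu_indices(len(X), k=1)]  # each unordered pair once
    mean_d = d.mean() if d.size else 0.0
    if mean_d == 0.0:
        return 0.0  # all points coincide (or no features left)
    alpha = np.log(2.0) / mean_d
    s = np.clip(np.exp(-alpha * d), 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return float(-(s * np.log2(s) + (1.0 - s) * np.log2(1.0 - s)).sum())

def leave_one_out_entropy(X, metric="euclidean"):
    """Entropy of the data with each feature removed in turn."""
    return np.array([
        distance_entropy(np.delete(X, j, axis=1), metric)
        for j in range(X.shape[1])
    ])
```

Calling `leave_one_out_entropy(X, "manhattan")` on a numeric array `X` returns one score per feature. How those scores are turned into a selected subset (for example, which features are dropped first) is not described in the summary, so that step is left out here.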