IM-c-means: a new clustering algorithm for clusters with skewed distributions
In this paper, a new clustering algorithm, IM-c-means, is proposed for clusters with skewed distributions. C-means algorithm is a well-known and widely used strategy for data clustering, but at the same time prone to poor performance if the data set is not distributed uniformly, which is called “uni...
Saved in:
Published in | Pattern analysis and applications : PAA Vol. 24; no. 2; pp. 611 - 623 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
London
Springer London
01.05.2021
Springer Nature B.V |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | In this paper, a new clustering algorithm, IM-c-means, is proposed for clusters with skewed distributions. C-means algorithm is a well-known and widely used strategy for data clustering, but at the same time prone to poor performance if the data set is not distributed uniformly, which is called “uniform effect” in studies. We first analyze the cause of this effect and find that it occurs only when clusters sizes are varied, whereas different object densities inter-clusters have no effect on c-means algorithm. According to this finding, we propose to form a new objective function by considering volumes and object densities of all clusters, which creates a new effective clustering algorithm with respect to the clusters with varied sizes or densities, while at the same time inheriting the good performance of traditional c-means algorithm for balanced data set. The experiments using both synthetic and real data sets have provided promising results of the proposed clustering algorithm. In addition, the nonparametric test has showed that the proposed algorithm could offer a significant improvement over other clustering methods for imbalanced data sets. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 1433-7541 1433-755X |
DOI: | 10.1007/s10044-020-00932-2 |