Improving K-means clustering with enhanced Firefly Algorithms

Bibliographic Details
Published in	Applied soft computing Vol. 84; p. 105763
Main Authors	Xie, Hailun; Zhang, Li; Lim, Chee Peng; Yu, Yonghong; Liu, Chengyu; Liu, Han; Walters, Julie
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.11.2019
Subjects	Data clustering; Firefly Algorithm; K-means clustering; Swarm intelligence algorithm
Summary:	In this research, we propose two variants of the Firefly Algorithm (FA), namely inward intensified exploration FA (IIEFA) and compound intensified exploration FA (CIEFA), to address the persistent problems of initialization sensitivity and local optima traps in the K-means clustering model. To enhance both exploitation and exploration, matrix-based search parameters and dispersing mechanisms are incorporated into the two proposed FA models. In the IIEFA model, we first replace the attractiveness coefficient with a randomized control matrix to release the FA from the constraints of biological law, so that exploitation in the neighbourhood is elevated from a one-dimensional to a multi-dimensional search mechanism with greater diversity in search scopes, scales, and directions. In the second model, CIEFA, we additionally employ a dispersing mechanism that dispatches fireflies with high similarities to new positions outside the close neighbourhood to perform global exploration. This dispersing mechanism ensures sufficient variance between the fireflies being compared, which increases search efficiency. The ALL-IDB2 database, a skin lesion data set, and a total of 15 UCI data sets are employed to evaluate the efficiency of the proposed FA models on clustering tasks. The minimum Redundancy Maximum Relevance (mRMR)-based feature selection method is also adopted to reduce feature dimensionality. The empirical results indicate that the proposed FA models demonstrate statistically significant superiority in both distance and performance measures for clustering tasks in comparison with conventional K-means clustering, five classical search methods, and five advanced FA variants.
• Two FA variants are proposed to overcome local optima traps of KM clustering.
• An inward action lifts the diagonal search constraints of FA to avoid stagnation.
• A dispersing mechanism expands the swarm exploration to diversify the search.
• The proposed clustering models achieve enhanced centroids with fast convergence.
• They outperform other classical and hybrid clustering models significantly.
ISSN:	1568-4946 1872-9681
DOI:	10.1016/j.asoc.2019.105763
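
The summary describes two ideas: replacing the scalar attractiveness coefficient with a randomized control matrix, and dispersing fireflies that are too similar to one another. The record does not give the paper's equations, so the following is only a rough sketch of how such ideas might be wired into a firefly-style search over K-means centroids. Everything here is an assumption made for illustration: the function names `sse` and `firefly_kmeans_sketch`, the parameters `alpha` and `disperse_tol`, the use of a per-dimension random vector as a stand-in for the control matrix, and the uniform relocation used as the dispersing step. It should not be read as the IIEFA or CIEFA update rules.

```python
import numpy as np


def sse(centroids, X):
    """K-means objective: sum of squared distances to the nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def firefly_kmeans_sketch(X, k, n_fireflies=10, n_iter=30, alpha=0.05,
                          disperse_tol=1e-3, seed=0):
    """Search for k centroids with a firefly-style swarm (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Per-dimension bounds of the data, repeated once per centroid.
    lo, hi = np.tile(X.min(axis=0), k), np.tile(X.max(axis=0), k)
    span = hi - lo
    dim = lo.size
    # Each firefly is a flattened set of k centroids, seeded from data points.
    swarm = np.stack([X[rng.choice(len(X), k, replace=False)].ravel()
                      for _ in range(n_fireflies)])
    cost = np.array([sse(f.reshape(k, -1), X) for f in swarm])
    best, best_cost = swarm[cost.argmin()].copy(), cost.min()

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] >= cost[i]:
                    continue  # only move towards brighter (lower-cost) fireflies
                diff = swarm[j] - swarm[i]
                if np.linalg.norm(diff) < disperse_tol * np.linalg.norm(span):
                    # ASSUMED dispersing step: a near-duplicate firefly is
                    # relocated uniformly at random within the data bounds.
                    cand = rng.uniform(lo, hi)
                else:
                    # ASSUMED stand-in for the randomized control matrix:
                    # an independent random factor per dimension, so the move
                    # is no longer confined to the line between i and j.
                    control = rng.random(dim)
                    cand = (swarm[i] + control * diff
                            + alpha * span * rng.normal(size=dim))
                cand = np.clip(cand, lo, hi)
                swarm[i], cost[i] = cand, sse(cand.reshape(k, -1), X)
                if cost[i] < best_cost:
                    best, best_cost = cand.copy(), cost[i]
    return best.reshape(k, -1), float(best_cost)
```

Calling `firefly_kmeans_sketch(X, k=3)` on an `(n_samples, n_features)` array returns the best centroid set found and its K-means objective value. The returned centroids could then be used to initialise a conventional K-means run, which is one plausible way to reduce the initialization sensitivity mentioned in the summary.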