Clustering the imbalanced datasets using modified Kohonen self-organizing map (KSOM)

Bibliographic Details
Published in: 2017 Computing Conference: 18-20 July 2017, pp. 751-755
Main Authors: Ahmad, Azlin; Ismail, Mohd Najib; Yusoff, Rubiyah; Rosli, Nenny Ruthfalydia
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2017
DOI: 10.1109/SAI.2017.8252180

Summary: The distribution of data plays an important role in determining the success of the learning process in machine learning. Data sets with imbalanced distributions may lead to biased results, especially in clustering. If the data is insufficient, the algorithm will not be able to cluster it properly, and this will add randomness to the grouping. Therefore, the KSOM algorithm is modified to improve the clustering process. The modification is based on the exploration and exploitation procedures in the Ant Clustering Algorithm (ACA). To investigate the effectiveness of the modified algorithm, three imbalanced data sets are chosen: glass, Wisconsin diagnostic breast cancer, and tropical wood. The results show that the modified KSOM is able to produce an accurate number of clusters, reduce the number of overlapping clusters, and slightly improve the accuracy.
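
For readers unfamiliar with the baseline, the following is a minimal sketch of a standard (unmodified) Kohonen self-organizing map on a one-dimensional grid of units. It does not reproduce the ACA-based modification described in the summary, whose details are not given in this record. The function names (`train_ksom`, `best_matching_unit`), the linear decay schedules, all parameter values, and the toy imbalanced data are assumptions made for illustration only.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weights are closest (Euclidean) to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))


def train_ksom(data, n_units=3, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a plain 1-D Kohonen map; returns the list of unit weight vectors."""
    rng = random.Random(seed)
    # Initialise each unit's weights from a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Assumed schedule: learning rate and radius decay linearly.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood around the winner on the unit grid.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy imbalanced data (made up): 20 points near (0, 0), 3 near (5, 5).
gen = random.Random(1)
majority = [(gen.gauss(0, 0.3), gen.gauss(0, 0.3)) for _ in range(20)]
minority = [(gen.gauss(5, 0.3), gen.gauss(5, 0.3)) for _ in range(3)]
data = majority + minority

units = train_ksom(data, n_units=3)
labels = [best_matching_unit(units, x) for x in data]
```

Each sample is assigned to the unit returned by `best_matching_unit`, so `labels` gives a cluster index per point. With a heavily skewed class ratio, most updates come from the majority group, which is the kind of bias the summary attributes to imbalanced data.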