Multi-fuzzy clustering validity index ensemble: A Dempster-Shafer theory-based parallel and series fusion

Bibliographic Details
Published in: Egyptian informatics journal, Vol. 24, no. 4, p. 100417
Main Authors: Wang, Hong-Yu; Wang, Jie-Sheng; Wang, Guan
Format: Journal Article
Language: English
Published: Elsevier, 01.12.2023
Summary: Clustering validity evaluation is a key part of the clustering process. To cope with complex data structures, traditional fuzzy clustering validity indices (FCVIs) have been designed with increasing complexity. The weighted combined validity evaluation method (WCVEM) is simple in structure, but its weights are difficult to select. This paper therefore proposes an ensemble method based on multiple fuzzy clustering algorithms and multiple FCVIs. First, the FCVIs are calculated from the multiple sets of cluster centers and membership degrees obtained by the fuzzy clustering algorithms, which improves the robustness of the FCVIs. Second, the FCVIs are combined using Dempster-Shafer (DS) theory. The basic probability assignment function of each validity index is obtained by calculating the credibility of that index for each candidate number of clusters. Finally, a decision module outputs the optimal number of clusters. The fuzzy clustering algorithms, the FCVIs, and the DS theory are combined in both a series and a parallel structure in order to verify the performance of the proposed model and the degree to which the information in the FCVIs is retained. The proposed method is simple in structure and does not require weight selection. Six artificial datasets and 12 UCI datasets were used to simulate and verify the method. Across the different datasets, the simulation results show that the parallel structure has the highest accuracy, while the series structure performs even worse than the weighted method on some datasets. In addition, the paper varies the fuzzy weighting value, and the experimental results show that the ensemble method is more stable than the other methods under different fuzzy weighting settings.
ISSN:1110-8665
DOI:10.1016/j.eij.2023.100417
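
The fusion step described in the summary can be illustrated with a minimal Python sketch of Dempster's rule of combination. This is not the authors' implementation: the function names (`scores_to_bpa`, `combine`, `best_cluster_number`), the `ignorance` discount, the score normalization used to build each basic probability assignment, and the example scores are all hypothetical. The summary does not give the paper's credibility formula, nor the details of its series and parallel arrangements, so neither is reproduced here.

```python
from functools import reduce
from itertools import product


def scores_to_bpa(scores, ignorance=0.1):
    """Turn one index's positive scores per candidate cluster number into a
    basic probability assignment (BPA). ASSUMPTION: plain normalization with
    a fixed mass on the whole frame; the paper's credibility formula is not
    given in the summary."""
    frame = frozenset(scores)
    total = sum(scores.values())
    bpa = {frozenset([c]): (1.0 - ignorance) * s / total
           for c, s in scores.items()}
    bpa[frame] = ignorance  # mass left on "any candidate" (ignorance)
    return bpa


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets over the same frame."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0 - 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def best_cluster_number(bpas):
    """Fuse all BPAs and return the singleton hypothesis with the largest
    mass, together with the fused mass function."""
    fused = reduce(combine, bpas)
    singletons = {next(iter(k)): v for k, v in fused.items() if len(k) == 1}
    return max(singletons, key=singletons.get), fused


# Made-up scores (higher = better) from three hypothetical validity indices
# for candidate cluster numbers 2, 3 and 4.
index_scores = [
    {2: 0.2, 3: 0.6, 4: 0.2},
    {2: 0.3, 3: 0.4, 4: 0.3},
    {2: 0.5, 3: 0.3, 4: 0.2},
]
best, fused = best_cluster_number([scores_to_bpa(s) for s in index_scores])
print("selected number of clusters:", best)
```

In this sketch the third index prefers two clusters, but the combined evidence still selects three, which shows how fusion can outvote a single dissenting index without hand-tuned weights.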