Soft Set Based Clustering and Its Comparison on Categorical Data
Published in | 2023 IEEE 9th Information Technology International Seminar (ITIS) pp. 1 - 5 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 18.10.2023 |
Summary: | Clustering categorical data is difficult because there is no natural measure of similarity between categorical values. Several methods, most recently centroid-based strategies, have been developed to simplify the similarity computation for categorical data, but they still incur long processing times. This article proposes an alternative, soft set-based clustering (SSC), built on the probability function of the multivariate multinomial distribution. The data are represented as soft sets, and each soft set assigns a probability to each object. The probability of each object is obtained from the joint cluster distribution function under the multivariate multinomial distribution, and the object is assigned to the cluster with the highest probability. Benchmark data sets from the UCI Machine Learning Repository are used to compare the approach with baseline techniques. The results show that the proposed approach performs better in purity, Rand index, and computation time. |
---|---|
DOI: | 10.1109/ITIS59651.2023.10419962 |
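
The summary describes assigning each object to the cluster with the highest probability under a multivariate multinomial model, and evaluating the result by purity. The sketch below illustrates that general idea only; it is not the authors' SSC algorithm, which this record does not specify. The function names (`fit_categorical_clusters`, `purity`), the random initial partition, the Laplace smoothing parameter `alpha`, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def fit_categorical_clusters(rows, k, n_iter=20, seed=0, alpha=1.0):
    """Hard-assignment clustering under a product-of-multinomials model.

    Illustrative sketch (hypothetical helper, not the published SSC method).
    rows: list of equal-length tuples of categorical values.
    """
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    # Number of distinct values per attribute, used for Laplace smoothing.
    domain_sizes = [len({r[j] for r in rows}) for j in range(n_attrs)]
    labels = [rng.randrange(k) for _ in rows]  # assumed: random initial partition

    for _ in range(n_iter):
        # Estimate cluster sizes and per-attribute value counts.
        sizes = Counter(labels)
        counts = [[Counter() for _ in range(n_attrs)] for _ in range(k)]
        for r, c in zip(rows, labels):
            for j, v in enumerate(r):
                counts[c][j][v] += 1

        def log_prob(r, c):
            # log prior + sum of log smoothed attribute-value probabilities
            lp = math.log((sizes[c] + alpha) / (len(rows) + alpha * k))
            for j, v in enumerate(r):
                lp += math.log(
                    (counts[c][j][v] + alpha)
                    / (sizes[c] + alpha * domain_sizes[j])
                )
            return lp

        # Assign every object to its most probable cluster.
        new_labels = [max(range(k), key=lambda c: log_prob(r, c)) for r in rows]
        if new_labels == labels:  # converged: partition unchanged
            break
        labels = new_labels
    return labels


def purity(labels, truth):
    """Fraction of objects that carry the majority true class of their cluster."""
    by_cluster = {}
    for c, t in zip(labels, truth):
        by_cluster.setdefault(c, Counter())[t] += 1
    majority = sum(cnt.most_common(1)[0][1] for cnt in by_cluster.values())
    return majority / len(labels)


# Toy data (made up for illustration): two attributes, two intended groups.
rows = [("red", "small"), ("red", "small"), ("red", "medium"),
        ("blue", "large"), ("blue", "large"), ("blue", "medium")]
truth = ["a", "a", "a", "b", "b", "b"]
labels = fit_categorical_clusters(rows, k=2)
score = purity(labels, truth)
```

Working in log space avoids numerical underflow when many attributes are multiplied together, and the smoothing term keeps unseen attribute values from driving a cluster's probability to zero. The result depends on the initial partition, so in practice several seeds would be tried.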