Nonparametric Bootstrap Likelihood Estimation to Investigate the Chance Set-Up on Clustering Results
Clustering algorithms are widely used in the knowledge discovery domain, but concerns and questions about the validity of the results must be considered. The datasets commonly used for clustering tasks are often large and scale-free, making conventional statistical techniques inadequate for analyzin...
Saved in:
Published in | IEEE open journal of the Computer Society Vol. 6; pp. 438 - 448 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published |
New York
IEEE
2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Clustering algorithms are widely used in the knowledge discovery domain, but concerns and questions about the validity of the results must be considered. The datasets commonly used for clustering tasks are often large and scale-free, making conventional statistical techniques inadequate for analyzing result uncertainty. This issue applies to most outcomes obtained from other knowledge discovery techniques, such as machine learning and statistical learning. Traditional statistical methods assume data follows standard distributions, whereas resampling and bootstrapping methods offer more accurate and reliable alternatives. This article introduces a method that employs bootstrap likelihood estimation to infer the uncertainty of generated clustering structures. We first calculated the clustering error in the original dataset and then utilized the proposed method to estimate its nonparametric bootstrapped likelihood. By comparing these two values, we can establish a nonparametric significance testing framework that directly determines the validity of the result. To evaluate the effectiveness of our method, we conducted experiments using synthetic and real datasets. The results demonstrate that our method can successfully validate clustering results. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 2644-1268 2644-1268 |
DOI: | 10.1109/OJCS.2025.3545261 |