Ensemble Method for Cluster Number Determination and Algorithm Selection in Unsupervised Learning
Main Author | |
---|---|
Format | Journal Article |
Language | English |
Published | 22.12.2021 |
Subjects | |
Online Access | Get full text |
Summary: | Unsupervised learning, and more specifically clustering, suffers from the
need for expertise in the field to be of use. Researchers must make careful and
informed decisions on which algorithm to use with which set of hyperparameters
for a given dataset. Additionally, researchers may need to determine the number
of clusters in the dataset, which is unfortunately itself an input to most
clustering algorithms. All of this must be settled before they can begin their
actual subject-matter work. After quantifying the impact of algorithm and hyperparameter
selection, we propose an ensemble clustering framework which can be leveraged
with minimal input. It can be used to determine both the number of clusters in
a dataset and a suitable algorithm for that dataset. A
code library is included in the Conclusion for ease of integration. |
---|---|
DOI: | 10.48550/arxiv.2112.13680 |