A semi-supervised clustering approach using labeled data
Published in | Scientia Iranica. Transaction D, Computer science & engineering, electrical engineering, Vol. 30; no. 1; pp. 104-115 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Tehran: Sharif University of Technology, 01.02.2023 |
Summary: | Interest in semi-supervised clustering has grown over recent decades. A review of the relevant literature shows that, for a range of real-life problems, semi-supervised clustering methods are more powerful than purely supervised or unsupervised methods, and that even a small amount of supervised information can significantly improve the results of unsupervised methods. One popular way to incorporate partial supervision is to use labeled data. This study proposes a semi-supervised clustering algorithm called ConvexClust. The method improves data clustering by combining a geometric view, borrowed from the Lune concept in the connectivity index, with 10% labeled data. Clustering begins with the labeled data and the formation of a convex hull; the unlabeled data are then labeled and the convex hull is updated in an iterative process. Evaluations on three UCI datasets and sixteen artificial datasets indicate that the proposed method outperforms other semi-supervised and traditional clustering techniques. |
---|---|
DOI: | 10.24200/sci.2022.58519.5772 |
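
The iterative idea outlined in the summary (seed with labeled points, form a convex hull, label the unlabeled points, update the hull) can be illustrated with a minimal sketch. This is not the authors' ConvexClust implementation: the record does not give the algorithm's details, so the Lune-based connectivity step is omitted and replaced here by a simple greedy rule, assumed purely for illustration, that assigns the unlabeled point closest to any class hull and then rebuilds that hull. The sketch also assumes one hull per class, 2-D points given as tuples, and at least one labeled point per class. All function names (`convex_hull`, `hull_distance`, `propagate_labels`) are hypothetical.

```python
from math import dist

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside(hull, p):
    """True if p lies inside or on a convex polygon given in CCW order."""
    if len(hull) < 3:
        return False
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def hull_distance(hull, p):
    """0 inside the hull; otherwise distance to the nearest hull vertex.

    Using vertices rather than edges is a simplification assumed for this sketch.
    """
    return 0.0 if inside(hull, p) else min(dist(v, p) for v in hull)

def propagate_labels(points, labels):
    """Greedy stand-in for the iterative step; labels[i] is a class id or None."""
    labels = list(labels)
    classes = sorted({c for c in labels if c is not None})
    while None in labels:
        # Rebuild one hull per class from the points labeled so far.
        hulls = {
            c: convex_hull([p for p, l in zip(points, labels) if l == c])
            for c in classes
        }
        # Pick the (unlabeled point, class) pair with the smallest hull distance.
        _, idx, cls = min(
            (hull_distance(hulls[c], p), i, c)
            for i, (p, l) in enumerate(zip(points, labels)) if l is None
            for c in classes
        )
        labels[idx] = cls
    return labels

# Two small groups with one labeled point each (10% of the data).
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.4),
          (10, 10), (11, 10), (10, 11), (11, 11), (10.5, 10.6)]
seeds = [0, None, None, None, None, 1, None, None, None, None]
result = propagate_labels(points, seeds)
```

Rebuilding every hull on each iteration keeps the sketch short but is quadratic in the number of points; it is meant to show the label-then-update loop, not to reproduce the reported results.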