A semi-supervised clustering approach using labeled data

Bibliographic Details
Published in: Scientia Iranica. Transaction D, Computer science & engineering, electrical engineering Vol. 30; no. 1; pp. 104-115
Main Authors: Taghizabet, A; Tanha, J; Amini, A; Mohammadzadeh, J
Format: Journal Article
Language: English
Published: Tehran, Sharif University of Technology, 01.02.2023
Summary: Over recent decades, interest in semi-supervised clustering has grown. A review of the relevant articles shows that, for a range of real-life problems, semi-supervised clustering methods are more powerful than supervised or unsupervised ones, and that even a small amount of supervised information can significantly improve the results of unsupervised methods. One popular way to incorporate partial supervised information is to use labeled data. In this study, a semi-supervised clustering algorithm called ConvexClust is proposed. The proposed method improves data clustering by combining a geometric view, borrowed from the Lune concept in the connectivity index, with 10% labeled data. Clustering begins by using the labeled data to form a convex hull. The unlabeled data are then labeled, and the convex hull updated, in an iterative process. Evaluations on three UCI datasets and sixteen artificial datasets indicate that the proposed method outperforms other semi-supervised and traditional clustering techniques.
DOI:10.24200/sci.2022.58519.5772
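
The summary outlines a loop: build a convex hull from the labeled points, label the unlabeled points, and update the hull as it goes. The following is a minimal 2D sketch of that general idea, not the published ConvexClust algorithm. The record does not say how an unlabeled point is matched to a cluster, so the nearest-hull rule used here is an assumption, and the Lune-based connectivity criterion is not implemented. The names `convex_hull`, `hull_distance`, and `grow_hull_clusters` are hypothetical.

```python
from math import hypot

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def hull_distance(p, hull):
    """Distance from p to a hull; 0.0 if p lies inside or on the boundary."""
    if len(hull) == 1:
        return hypot(p[0] - hull[0][0], p[1] - hull[0][1])
    edges = list(zip(hull, hull[1:] + hull[:1]))
    if len(hull) >= 3 and all(cross(a, b, p) >= 0 for a, b in edges):
        return 0.0
    return min(segment_distance(p, a, b) for a, b in edges)

def grow_hull_clusters(labeled, unlabeled):
    """labeled: {label: [(x, y), ...]} seed points; unlabeled: [(x, y), ...].

    Assumed rule (not taken from the paper): repeatedly give the unlabeled
    point closest to any cluster hull that cluster's label, then rebuild
    that cluster's hull.
    """
    members = {label: list(pts) for label, pts in labeled.items()}
    hulls = {label: convex_hull(pts) for label, pts in members.items()}
    remaining = list(unlabeled)
    assigned = {}
    while remaining:
        _, i, label = min(
            (hull_distance(p, hulls[k]), i, k)
            for i, p in enumerate(remaining)
            for k in hulls
        )
        point = remaining.pop(i)
        members[label].append(point)
        hulls[label] = convex_hull(members[label])  # update hull after each assignment
        assigned[point] = label
    return assigned

if __name__ == "__main__":
    seeds = {"a": [(0, 0), (1, 0), (0, 1)], "b": [(10, 10), (11, 10), (10, 11)]}
    print(grow_hull_clusters(seeds, [(0.2, 0.2), (2, 2), (9, 9), (12, 12)]))
```

Rebuilding the hull after every single assignment keeps the sketch short but is quadratic in the number of unlabeled points; a practical version would likely batch assignments or use an incremental hull.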