Clustering with the Average Silhouette Width

Bibliographic Details
Published in: Computational Statistics & Data Analysis, Vol. 158; p. 107190
Main Authors: Batool, Fatima; Hennig, Christian
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.06.2021
Summary: The Average Silhouette Width (ASW) is a popular cluster validation index to estimate the number of clusters. The question whether it also is suitable as a general objective function to be optimized for finding a clustering is addressed. Two algorithms (the standard version OSil and a fast version FOSil) are proposed, and they are compared with existing clustering methods in an extensive simulation study covering known and unknown numbers of clusters. Real data sets are analysed, partly exploring the use of the new methods with non-Euclidean distances. The ASW is shown to satisfy some axioms that have been proposed for cluster quality functions. The new methods prove useful and sensible in many cases, but some weaknesses are also highlighted. These also concern the use of the ASW for estimating the number of clusters together with other methods, which is of general interest due to the popularity of the ASW for this task.
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2021.107190