Determining the Number of Clusters via Iterative Consensus Clustering

Bibliographic Details
Published in	arXiv.org
Main Authors	Race, Shaina; Meyer, Carl; Valakuzhy, Kevin
Format	Paper; Journal Article
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 05.08.2014
Subjects	Algorithms; Clustering; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Eigenvalues; Random walk; Similarity; Statistics - Machine Learning; Transition probabilities

Summary:	We use a cluster ensemble to determine the number of clusters, k, in a group of data. A consensus similarity matrix is formed from the ensemble using multiple algorithms and several values for k. A random walk is induced on the graph defined by the consensus matrix, and the eigenvalues of the associated transition probability matrix are used to determine the number of clusters. For noisy or high-dimensional data, an iterative technique is presented to refine this consensus matrix in a way that encourages a block-diagonal form. It is shown that the resulting consensus matrix is generally superior to existing similarity matrices for this type of spectral analysis.
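
The pipeline in the summary (ensemble, consensus matrix, random walk, eigenvalues) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several assumptions are made for illustration: a small NumPy k-means with random restarts stands in for the "multiple algorithms" of the ensemble; the consensus matrix simply counts how often two points share a cluster; and k is read off from the largest gap among the leading eigenvalues, a common heuristic that the summary does not spell out. All function names are hypothetical, and the iterative refinement step is not shown.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; a stand-in for any base clustering algorithm."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center as is
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def consensus_matrix(X, ks, runs_per_k, seed=0):
    """C[i, j] = number of ensemble clusterings placing i and j together."""
    rng = np.random.default_rng(seed)
    n = len(X)
    C = np.zeros((n, n))
    for k in ks:
        for _ in range(runs_per_k):
            labels = kmeans_labels(X, k, rng)
            C += labels[:, None] == labels[None, :]
    return C

def transition_eigenvalues(C):
    """Eigenvalues of the random-walk matrix P = D^-1 C, largest first.

    P is similar to the symmetric matrix D^-1/2 C D^-1/2, so the
    eigenvalues are real and can be computed with eigvalsh.
    """
    d = C.sum(axis=1)
    S = C / np.sqrt(np.outer(d, d))
    return np.sort(np.linalg.eigvalsh(S))[::-1]

def estimate_k(eigs, max_k=10):
    """Assumed heuristic: k is the position of the largest eigenvalue gap."""
    m = min(max_k, len(eigs))
    gaps = eigs[: m - 1] - eigs[1:m]
    return int(np.argmax(gaps)) + 1
```

For a perfectly block-diagonal consensus matrix with three blocks, the transition matrix has exactly three eigenvalues equal to 1 and the rest equal to 0, so the gap heuristic returns 3. On real data the leading eigenvalues are only close to 1, which is why a consensus matrix nearer to block-diagonal form makes the count easier to read.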
ISSN:	2331-8422
DOI:	10.48550/arxiv.1408.0967