Model-Based Clustering With Dissimilarities: A Bayesian Approach
Published in | Journal of computational and graphical statistics Vol. 16; no. 3; pp. 559-585 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Alexandria: Taylor & Francis, 01.09.2007; American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America; Taylor & Francis Ltd |
Subjects | |
Summary: | A Bayesian model-based clustering method is proposed for clustering objects on the basis of dissimilarities. This combines two basic ideas. The first is that the objects have latent positions in a Euclidean space, and that the observed dissimilarities are measurements of the Euclidean distances with error. The second idea is that the latent positions are generated from a mixture of multivariate normal distributions, each one corresponding to a cluster. We estimate the resulting model in a Bayesian way using Markov chain Monte Carlo. The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainties. In the examples we study, the clustering results based on low-dimensional configurations were almost as good as those based on high-dimensional ones. Thus, the method can be used as a tool for dimension reduction when clustering high-dimensional objects, which may be useful especially for visual inspection of clusters.
We also propose a Bayesian criterion for choosing the dimension of the object configuration and the number of clusters simultaneously. This is easy to compute and works reasonably well in simulations and real examples. |
---|---|
ISSN: | 1061-8600, 1537-2715 |
DOI: | 10.1198/106186007X236127 |
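
The two ideas in the summary (latent Euclidean positions observed through noisy dissimilarities, and positions drawn from a mixture of multivariate normals) can be illustrated with a small simulation. The sketch below is not the authors' code and does not perform the Markov chain Monte Carlo estimation. It only generates data of the kind the model assumes. The function name `simulate_dissimilarities`, the spherical covariance, the Gaussian error clipped at zero, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def simulate_dissimilarities(n_per_cluster, means, cov_scale=0.5, noise_sd=0.1, seed=0):
    """Simulate noisy dissimilarities from latent positions drawn from a normal mixture.

    Hypothetical helper: the spherical covariance and the clipped Gaussian
    error are illustrative assumptions, not details taken from the record.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    n_clusters, dim = means.shape

    # Latent positions: one spherical normal component per cluster.
    labels = np.repeat(np.arange(n_clusters), n_per_cluster)
    positions = means[labels] + cov_scale * rng.standard_normal((labels.size, dim))

    # Exact Euclidean distances between all pairs of latent positions.
    diffs = positions[:, None, :] - positions[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))

    # Observed dissimilarities: distances plus symmetric Gaussian error,
    # clipped at zero so that they remain nonnegative.
    noise = np.triu(noise_sd * rng.standard_normal(distances.shape), k=1)
    noise = noise + noise.T
    dissimilarities = np.clip(distances + noise, 0.0, None)
    np.fill_diagonal(dissimilarities, 0.0)
    return positions, labels, dissimilarities


positions, labels, dissim = simulate_dissimilarities(
    n_per_cluster=20, means=[[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
)
```

A method of the kind described in the summary would take only `dissim` as input and try to recover both a low-dimensional configuration resembling `positions` and a grouping resembling `labels`.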