Semi-Supervised Learning on Riemannian Manifolds

Bibliographic Details
Published in	Machine Learning, Vol. 56, no. 1-3, pp. 209-239
Main Authors	Belkin, Mikhail; Niyogi, Partha
Format	Journal Article
Language	English
Published	Dordrecht: Springer Nature B.V., 01.07.2004
Subjects	Mathematical functions; Studies

Summary:	Issue Title: Theoretical Advances in Data Clustering. We consider the general problem of utilizing both labeled and unlabeled data to improve classification accuracy. Under the assumption that the data lie on a submanifold in a high-dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. The central idea of our approach is that classification functions are naturally defined only on the submanifold in question rather than the total ambient space. Using the Laplace-Beltrami operator, one produces a basis (the Laplacian Eigenmaps) for a Hilbert space of square-integrable functions on the submanifold. To recover such a basis, only unlabeled examples are required. Once such a basis is obtained, training can be performed using the labeled data set. Our algorithm models the manifold using the adjacency graph for the data and approximates the Laplace-Beltrami operator by the graph Laplacian. We provide details of the algorithm, its theoretical justification, and several practical applications for image, speech, and text classification.
ISSN:	0885-6125 1573-0565
DOI:	10.1023/B:MACH.0000033120.25363.1e
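
Illustration:	The summary outlines a pipeline: build an adjacency graph over all points (labeled and unlabeled), take the lowest eigenvectors of its graph Laplacian as a basis, then fit coefficients on the labeled points only. The following is a minimal sketch of that idea, not the authors' implementation. The k-nearest-neighbor graph with 0/1 weights, the unnormalized Laplacian, the least-squares fit, the labels in {-1, +1}, and all function and parameter names are assumptions made for this example; the record does not specify these details.

```python
import numpy as np


def laplacian_eigenbasis(X, n_neighbors=4, n_eigenvectors=2):
    """Return the lowest eigenvectors of a kNN graph Laplacian (assumed construction)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor

    # Adjacency graph: connect each point to its n_neighbors nearest points.
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    W[rows, nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize so the graph is undirected

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :n_eigenvectors]


def classify(X, labeled_idx, labels, n_neighbors=4, n_eigenvectors=2):
    """Fit coefficients on labeled points only, then predict a sign for every point."""
    E = laplacian_eigenbasis(X, n_neighbors, n_eigenvectors)
    # Least-squares fit of the labels in the eigenvector basis.
    coeffs, *_ = np.linalg.lstsq(E[labeled_idx], labels, rcond=None)
    return np.sign(E @ coeffs)


# Toy usage: two well-separated line segments, two labeled points on each.
xs = np.arange(20) * 0.1
X = np.vstack([
    np.column_stack([xs, np.zeros(20)]),
    np.column_stack([xs, np.full(20, 10.0)]),
])
labeled_idx = np.array([0, 5, 20, 25])
labels = np.array([1.0, 1.0, -1.0, -1.0])
predictions = classify(X, labeled_idx, labels)
```

Note that the eigenvector basis is computed without using any labels, which matches the summary's remark that only unlabeled examples are required to recover the basis. The number of eigenvectors should stay at or below the number of labeled points so that the least-squares fit is not underdetermined.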