Semi-Supervised Learning on Riemannian Manifolds


Bibliographic Details
Published in: Machine Learning, Vol. 56, no. 1-3, pp. 209-239
Main Authors: Belkin, Mikhail; Niyogi, Partha
Format: Journal Article
Language: English
Published: Dordrecht, Springer Nature B.V., 01.07.2004
Issue Title: Theoretical Advances in Data Clustering
Summary: We consider the general problem of utilizing both labeled and unlabeled data to improve classification accuracy. Under the assumption that the data lie on a submanifold in a high-dimensional space, we develop an algorithmic framework to classify a partially labeled data set in a principled manner. The central idea of our approach is that classification functions are naturally defined only on the submanifold in question rather than the total ambient space. Using the Laplace-Beltrami operator, one produces a basis (the Laplacian Eigenmaps) for a Hilbert space of square-integrable functions on the submanifold. To recover such a basis, only unlabeled examples are required. Once such a basis is obtained, training can be performed using the labeled data set. Our algorithm models the manifold using the adjacency graph for the data and approximates the Laplace-Beltrami operator by the graph Laplacian. We provide details of the algorithm, its theoretical justification, and several practical applications for image, speech, and text classification.
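The pipeline outlined in the summary (adjacency graph, graph Laplacian, eigenvector basis, training on the labeled points) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the summary does not specify the graph construction or the training step, so the symmetric k-nearest-neighbor graph with 0/1 weights, the unnormalized Laplacian L = D - W, the least-squares fit, and the sign threshold are all assumptions, as are the names `laplacian_eigenbasis`, `classify`, `n_neighbors`, and `n_basis`.

```python
import numpy as np


def laplacian_eigenbasis(X, n_neighbors=5, n_basis=2):
    """Return the n_basis lowest eigenvectors of a kNN graph Laplacian.

    Assumption: symmetric kNN adjacency with 0/1 weights and the
    unnormalized Laplacian L = D - W. Labels are not used here.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances in the ambient space.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    # Connect i and j if either is among the other's nearest neighbors.
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_neighbors)
    W[rows, nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)
    # Graph Laplacian as a discrete stand-in for the Laplace-Beltrami operator.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :n_basis]


def classify(X, labeled_idx, labels, n_neighbors=5, n_basis=2):
    """Fit +1/-1 labels on the labeled rows, then label every point.

    Assumption: a plain least-squares fit in the eigenvector basis,
    followed by thresholding at zero.
    """
    E = laplacian_eigenbasis(X, n_neighbors, n_basis)
    coef, *_ = np.linalg.lstsq(E[labeled_idx], np.asarray(labels, dtype=float), rcond=None)
    return np.where(E @ coef >= 0, 1, -1)
```

A call such as `classify(X, [0, 10], [1, -1], n_neighbors=3)` would label every row of `X` from two labeled examples. Note that the basis is computed from all points, labeled or not, which mirrors the summary's remark that only unlabeled examples are required to recover it.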
ISSN: 0885-6125, 1573-0565
DOI: 10.1023/B:MACH.0000033120.25363.1e