Parametric UMAP Embeddings for Representation and Semisupervised Learning

Bibliographic Details
Published in	Neural Computation, Vol. 33, no. 11, pp. 2881-2907
Main Authors	Sainburg, Tim; McInnes, Leland; Gentner, Timothy Q.
Format	Journal Article
Language	English
Published	MIT Press (The MIT Press Journals), One Rogers Street, Cambridge, MA 02142-1209, USA, 12.10.2021
Subjects	Algorithms; Embedding; Fuzzy sets; Graphical representations; Machine learning; Neural networks; Regularization; Semi-supervised learning; Structured data; Topology
Summary:	UMAP is a nonparametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) computing a graphical representation of a data set (fuzzy simplicial complex) and (2) through stochastic gradient descent, optimizing a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of a learned parametric mapping (e.g., fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semisupervised learning by capturing structure in unlabeled data.
Bibliography:	November, 2021
ISSN:	0899-7667 (print), 1530-888X (online)
DOI:	10.1162/neco_a_01434
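
Illustrative sketch: the two steps named in the summary can be illustrated with a small NumPy example. This is not the authors' implementation. It assumes a simplified fuzzy k-nearest-neighbor graph (the per-point scale is a crude mean distance rather than UMAP's calibrated bandwidth), a linear encoder standing in for the neural network, a low-dimensional similarity of 1 / (1 + d^2), and plain gradient descent with step halving on a fuzzy cross-entropy loss. All function names and parameter values are hypothetical.

```python
import numpy as np

def fuzzy_knn_graph(X, k=5):
    """Step 1 (simplified): symmetric fuzzy kNN graph with weights in [0, 1]."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                      # exclude self-neighbors
    idx = np.argsort(d, axis=1)[:, :k]               # k nearest neighbors
    knn_d = np.take_along_axis(d, idx, axis=1)
    rho = knn_d[:, :1]                               # nearest-neighbor distance
    sigma = knn_d.mean(axis=1, keepdims=True) + 1e-12  # ASSUMPTION: crude scale
    w = np.exp(-(knn_d - rho) / sigma)
    P = np.zeros((n, n))
    np.put_along_axis(P, idx, w, axis=1)
    return P + P.T - P * P.T                         # fuzzy union symmetrization

def encode(X, W):
    """Parametric map; ASSUMPTION: a linear encoder replaces the network."""
    return X @ W

def loss_and_grad(X, W, P, eps=1e-4):
    """Step 2: fuzzy cross entropy between graph P and embedding similarity q."""
    Z = encode(X, W)
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    q = 1.0 / (1.0 + d2)
    off = ~np.eye(len(X), dtype=bool)                # ignore self-pairs
    ce = -(P * np.log(q) + (1 - P) * np.log(1 - q + eps))[off].sum()
    # Derivative of the loss with respect to each squared distance.
    G = P * q - (1 - P) * q ** 2 / (1 - q + eps)
    G[~off] = 0.0
    dZ = 4.0 * (G.sum(axis=1)[:, None] * Z - G @ Z)  # G is symmetric
    return ce, X.T @ dZ                              # chain rule through Z = XW

def fit(X, P, dim=2, steps=50, seed=0):
    """Gradient descent on the encoder weights; halve the step until loss drops."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], dim))
    loss, grad = loss_and_grad(X, W, P)
    history = [loss]
    for _ in range(steps):
        lr = 1e-2
        while lr > 1e-10:
            W_try = W - lr * grad
            loss_try, grad_try = loss_and_grad(X, W_try, P)
            if loss_try < loss:
                W, loss, grad = W_try, loss_try, grad_try
                break
            lr *= 0.5
        history.append(loss)
    return W, history

# Toy data: two Gaussian blobs in five dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (30, 5)), rng.normal(4.0, 1.0, (30, 5))])
P = fuzzy_knn_graph(X, k=5)
W, history = fit(X, P)

# The learned mapping embeds unseen points without re-running the optimization.
X_new = rng.normal(4.0, 1.0, (3, 5))
Z_new = encode(X_new, W)
print("loss before/after:", history[0], history[-1])
print("embedding of new points:", Z_new.shape)
```

Because the optimized quantity is the weight matrix `W` rather than the embedding coordinates themselves, new data are embedded by a single call to `encode`, which is the benefit of a learned parametric mapping that the summary describes.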