Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning

Bibliographic Details
Published in: arXiv.org
Main Author: Wen, Zixin
Format: Paper; Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 30.05.2021

Summary: Unsupervised contrastive learning has attracted increasing attention in recent research and has proven to be a powerful method for learning representations from unlabeled data. However, little theoretical analysis is known for this framework. In this paper, we study the optimization of deep unsupervised contrastive learning. We prove that end-to-end training, which simultaneously updates two deep over-parameterized neural networks, finds an approximate stationary solution of the non-convex contrastive loss. This result is inherently different from existing over-parameterized analyses in the supervised setting because, rather than learning a specific target function, unsupervised contrastive learning tries to encode the unlabeled data distribution into the neural networks, a problem that generally has no optimal solution. Our analysis provides theoretical insight into the practical success of these unsupervised pretraining methods.
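The end-to-end training scheme described in the summary can be illustrated with a minimal sketch; the following is my own assumption of the setup, not code from the paper. Two encoders f and g are updated simultaneously by SGD on an InfoNCE-style contrastive loss computed over two augmented views of the same unlabeled batch. The encoder widths, the Gaussian "augmentation", and the exact loss form are hypothetical stand-ins for whatever the paper analyzes.

import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(in_dim=32, width=512, out_dim=64):
    # Hypothetical over-parameterized two-layer encoder (wide hidden layer).
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(), nn.Linear(width, out_dim))

def contrastive_loss(z1, z2, temperature=0.5):
    # InfoNCE-style loss: each sample in view 1 is matched against its
    # counterpart in view 2; all other pairs in the batch act as negatives.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))       # positive pairs lie on the diagonal
    return F.cross_entropy(logits, targets)

# End-to-end training: both networks are updated simultaneously by SGD
# on the non-convex contrastive objective.
f, g = make_encoder(), make_encoder()
opt = torch.optim.SGD(list(f.parameters()) + list(g.parameters()), lr=0.1)

for step in range(100):
    x = torch.randn(128, 32)                                 # stand-in for an unlabeled batch
    x1 = x + 0.1 * torch.randn_like(x)                       # two noisy "augmented" views
    x2 = x + 0.1 * torch.randn_like(x)
    loss = contrastive_loss(f(x1), g(x2))
    opt.zero_grad()
    loss.backward()
    opt.step()                                               # step toward an approximate stationary point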
ISSN: 2331-8422
DOI: 10.48550/arxiv.2002.06979