Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 08.10.2024 |
Summary: | Long-tailed semi-supervised learning (LTSSL) poses a significant challenge: training models with limited labeled data that exhibits a long-tailed label distribution. Current state-of-the-art LTSSL approaches rely heavily on high-quality pseudo-labels for large-scale unlabeled data. However, these methods often neglect the impact of the representations learned by the neural network and struggle with real-world unlabeled data, which typically follows a distribution different from that of the labeled data. This paper introduces a novel probabilistic framework that unifies several recent proposals in long-tail learning. The framework derives a class-balanced contrastive loss through Gaussian kernel density estimation (sketched below). We introduce a continuous contrastive learning method, CCL, which extends the framework to unlabeled data using reliable and smoothed pseudo-labels. By progressively estimating the underlying label distribution and optimizing its alignment with model predictions (also sketched below), we tackle the diverse distributions of unlabeled data encountered in real-world scenarios. Extensive experiments across multiple datasets with varying unlabeled data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving over 4% improvement on the ImageNet-127 dataset. Our source code is available at https://github.com/zhouzihao11/CCL |
DOI: | 10.48550/arxiv.2410.06109 |
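
The abstract does not spell out the class-balanced contrastive loss, so the following is a minimal PyTorch sketch of one way a Gaussian kernel density estimate could weight contrastive pairs. The function names, the 1/class-frequency weighting, and the exact form of the loss are assumptions for illustration, not the paper's definition; the authoritative version is in the linked repository.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(z: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Pairwise squared distances turned into a Gaussian kernel matrix,
    # the building block of a kernel density estimate over embeddings.
    d2 = torch.cdist(z, z).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def class_balanced_contrastive_loss(z: torch.Tensor,
                                    labels: torch.Tensor,
                                    sigma: float = 1.0,
                                    eps: float = 1e-8) -> torch.Tensor:
    """Sketch: supervised-contrastive-style loss with Gaussian-kernel
    similarities and 1/class-frequency anchor weights (assumed form)."""
    z = F.normalize(z, dim=1)
    k = gaussian_kernel(z, sigma)                        # (B, B) kernel matrix
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1))   # same-class mask
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = same & ~eye                                    # positives, no self-pairs
    # Kernel density over all other points (the KDE normalizer per anchor).
    denom = (k * ~eye).sum(dim=1) + eps
    # Kernel mass on same-class points: high when the class is locally dense.
    pos_mass = (k * pos).sum(dim=1) + eps
    # Weight each anchor inversely to its class frequency in the batch,
    # so head-class pairs do not dominate the objective.
    counts = torch.bincount(labels, minlength=int(labels.max()) + 1).float()
    w = (1.0 / counts.clamp(min=1))[labels]
    return -(w * torch.log(pos_mass / denom)).sum() / w.sum()
```

The 1/n_c anchor weights matter because, under a long-tailed distribution, a random batch is dominated by head-class pairs; without reweighting, the contrastive objective would effectively ignore tail-class anchors.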
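
Likewise, the progressive label-distribution estimation is only named in the abstract. Below is a hedged sketch of one plausible mechanism: an exponential moving average of the model's marginal predictions on unlabeled batches, used to rebalance pseudo-labels before they are thresholded. The class name, momentum value, and alignment rule are illustrative assumptions, not the paper's algorithm.

```python
import torch

class LabelDistributionEstimator:
    """Running estimate of the unlabeled-data class marginal (assumed mechanism)."""

    def __init__(self, num_classes: int, momentum: float = 0.99):
        # Start from a uniform prior over classes.
        self.dist = torch.full((num_classes,), 1.0 / num_classes)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, probs: torch.Tensor) -> None:
        # probs: (B, C) softmax outputs on an unlabeled batch; fold the
        # batch marginal into the running EMA estimate of the distribution.
        batch_marginal = probs.mean(dim=0).cpu()
        self.dist = self.momentum * self.dist + (1 - self.momentum) * batch_marginal

    def align(self, probs: torch.Tensor) -> torch.Tensor:
        # Rescale predictions by the estimated marginal and renormalize,
        # nudging pseudo-labels toward the estimated unlabeled distribution.
        adjusted = probs * self.dist.to(probs.device)
        return adjusted / adjusted.sum(dim=1, keepdim=True)
```

In a training loop, one would call update() on each unlabeled batch and pass the teacher's predictions through align() before converting them into pseudo-labels, so the labels track the estimated (rather than an assumed) unlabeled class distribution.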