NCL++: Nested Collaborative Learning for Long-Tailed Visual Recognition
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 29.06.2023 |
Summary: | Long-tailed visual recognition has received increasing attention in recent
years. Due to the extremely imbalanced data distribution in long-tailed
learning, the learning process shows great uncertainty. For example, the
predictions of different experts on the same image vary remarkably despite
identical training settings. To alleviate this uncertainty, we propose Nested
Collaborative Learning (NCL++), which tackles the long-tailed learning problem
through collaborative learning. Specifically, the collaborative learning is
two-fold: inter-expert collaborative learning (InterCL) and intra-expert
collaborative learning (IntraCL). InterCL trains multiple experts
collaboratively and concurrently, aiming to transfer knowledge among
the experts. IntraCL is similar to InterCL, but it conducts the collaborative
learning on multiple augmented copies of the same image within a single
expert. To achieve this collaborative learning in the long-tailed setting,
a balanced online distillation is proposed to enforce consistent predictions
among different experts and augmented copies, which reduces the learning
uncertainty. Moreover, to improve fine-grained discrimination of confusing
categories, we further propose Hard Category Mining (HCM), which selects the
negative categories with high predicted scores as the hard categories. The
collaborative learning is then formulated in a nested way, in which learning
is conducted not only on all categories from a full perspective but also on
the hard categories from a partial perspective. Extensive experiments
demonstrate the superiority of our method, which outperforms the
state of the art whether using a single model or an ensemble. The code will
be publicly released. |
DOI: | 10.48550/arxiv.2306.16709 |
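The Hard Category Mining step described in the abstract (selecting the negative categories with the highest predicted scores as hard categories) can be sketched as follows. This is a hypothetical illustration based only on the abstract's description, not the authors' released code; the function name and the choice of `k` are assumptions.

```python
import torch

def hard_category_mining(logits, labels, k=3):
    """Hypothetical sketch of HCM: for each sample, pick the k negative
    categories with the highest predicted scores (the "hard" categories),
    together with the ground-truth category, for a partial-view loss.

    logits: (batch, num_classes) raw scores from one expert
    labels: (batch,) ground-truth category indices
    returns: (batch, k + 1) indices -- ground truth first, then hard negatives
    """
    masked = logits.clone()
    # Exclude the ground-truth category so only negatives are ranked.
    masked.scatter_(1, labels.unsqueeze(1), float("-inf"))
    hard_neg = masked.topk(k, dim=1).indices  # (batch, k), descending score
    return torch.cat([labels.unsqueeze(1), hard_neg], dim=1)
```

A loss computed over only these `k + 1` categories gives the "partial perspective" the abstract mentions, complementing the full-category loss.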
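The online-distillation idea (forcing consistent predictions among experts to reduce uncertainty) can be illustrated with a plain mutual-consistency loss: each expert's prediction is pulled toward the averaged soft prediction of the other experts via KL divergence. This is a generic sketch, not NCL++'s exact formulation; in particular the paper's *balanced* variant re-weights for the long-tailed class distribution, which is omitted here, and the temperature `T` is an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F

def online_distillation_loss(expert_logits, T=2.0):
    """Hypothetical sketch of inter-expert consistency: KL-distill each
    expert toward the mean soft prediction of the remaining experts.

    expert_logits: list of (batch, num_classes) tensors, one per expert
    """
    probs = [F.softmax(z / T, dim=1) for z in expert_logits]
    loss = 0.0
    for i, z in enumerate(expert_logits):
        # Soft target: average prediction of all OTHER experts, detached
        # so gradients only flow into the student side.
        others = torch.stack([p for j, p in enumerate(probs) if j != i])
        target = others.mean(dim=0).detach()
        log_p = F.log_softmax(z / T, dim=1)
        loss = loss + F.kl_div(log_p, target, reduction="batchmean") * T * T
    return loss / len(expert_logits)
```

IntraCL follows the same pattern, except the "experts" are replaced by one expert's predictions on multiple augmented copies of the same image.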