Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion
Format | Journal Article |
---|---|
Language | English |
Published | 19.07.2024 |
Summary: Class-incremental learning is a challenging problem in which the goal is to train a model that can classify data from an increasing number of classes over time. Vision-language pre-trained models such as CLIP demonstrate strong generalization that allows them to excel at class-incremental learning with completely frozen parameters. However, further adapting the model to downstream tasks by simply fine-tuning it leads to severe forgetting. Most existing works with pre-trained models assume that the forgetting of old classes is uniform when the model acquires new knowledge. In this paper, we propose a method named Adaptive Representation Adjustment and Parameter Fusion (RAPF). During training on new data, we measure the influence of new classes on old ones and adjust the old-class representations using textual features. After training, we employ a decomposed parameter fusion to further mitigate the forgetting introduced by fine-tuning the adapter module. Experiments on several conventional benchmarks show that our method achieves state-of-the-art results. Our code is available at https://github.com/linlany/RAPF.
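The abstract names two mechanisms, representation adjustment driven by textual features and a decomposed fusion of adapter parameters, without giving implementation detail. Below is a minimal, hypothetical PyTorch sketch of one plausible reading of each; every name and choice here (`adjust_old_prototypes`, `fuse_decomposed`, the margin, the SVD split, the damping factor) is an assumption for illustration, not the authors' RAPF implementation, which lives at the repository linked above.

```python
import torch

def adjust_old_prototypes(old_text, new_text, old_protos, margin=0.1):
    """Assumed reading of 'adaptive representation adjustment': old-class
    prototypes that sit close to an incoming class (measured by cosine
    similarity of unit-normalized text features) are nudged away from it,
    in proportion to how strong that influence is."""
    sim = new_text @ old_text.T                 # (num_new, num_old) cosine similarities
    influence, nearest = sim.max(dim=0)         # strongest new-class influence per old class
    away = old_text - new_text[nearest]         # direction away from the nearest new class
    away = away / away.norm(dim=1, keepdim=True)
    hit = influence > margin                    # adjust only strongly affected old classes
    old_protos[hit] = old_protos[hit] + influence[hit, None] * away[hit]
    return old_protos / old_protos.norm(dim=1, keepdim=True)

def fuse_decomposed(w_old, w_new, rank=16, damp=0.5):
    """Assumed reading of 'decomposed parameter fusion': split the adapter's
    weight change into a dominant low-rank part (kept as the new task's
    signal) and a residual (damped to limit drift from the old weights)."""
    delta = w_new - w_old
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vh[:rank]
    return w_old + low_rank + damp * (delta - low_rank)
```

Under these assumptions, one would call `fuse_decomposed` on the adapter weights after each incremental task and `adjust_old_prototypes` on cached class means during training; the rank, damping, and margin values are purely illustrative and would need tuning.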
DOI: 10.48550/arxiv.2407.14143