Dissecting Continual Learning: a Structural and Data Analysis
Main Author | |
---|---|
Format | Journal Article |
Language | English |
Published | 03.01.2023 |
Subjects | |
Online Access | Get full text |
Summary: | Continual Learning (CL) is a field dedicated to devising algorithms
that achieve lifelong learning. Overcoming the disruption of previously
acquired knowledge, a drawback of deep learning models known as catastrophic
forgetting, is a hard challenge. Currently, deep learning methods can attain
impressive results when the modeled data do not undergo a considerable
distributional shift across learning sessions, but whenever such systems are
exposed to this incremental setting, performance drops very quickly.
Overcoming this limitation is fundamental: it would allow us to build truly
intelligent systems that exhibit both stability and plasticity, and it would
spare us the onerous cost of retraining these architectures from scratch on
the updated data. In this thesis, we tackle the problem from multiple
directions. In a first study, we show that in rehearsal-based techniques
(systems that use a memory buffer), the quantity of data stored in the
rehearsal buffer matters more than its quality. Secondly, we present one of
the early works on incremental learning with ViT architectures, comparing
functional, weight, and attention regularization approaches, and we propose a
novel and effective asymmetric loss. Finally, we study pretraining and how it
affects performance in Continual Learning, raising some questions about the
effective progression of the field. We close with future directions and final
remarks. |
---|---|
DOI: | 10.48550/arxiv.2301.01033 |
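
The abstract's first study concerns rehearsal-based methods, which keep a small buffer of past examples and replay them alongside new data. As a point of reference only, here is a minimal sketch of such a buffer in Python, assuming a generic stream of (example, label) pairs; the `RehearsalBuffer` class and its reservoir-sampling policy are illustrative assumptions, not the buffer construction used in the thesis.

```python
import random


class RehearsalBuffer:
    """Fixed-capacity memory of past examples, filled with reservoir
    sampling so every sample seen so far has an equal probability of
    being retained (a common, but here assumed, storage policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []       # stored (example, label) pairs
        self.num_seen = 0    # total examples observed across all tasks

    def add(self, example, label):
        self.num_seen += 1
        if len(self.data) < self.capacity:
            self.data.append((example, label))
        else:
            # Replace a stored item with probability capacity / num_seen.
            idx = random.randrange(self.num_seen)
            if idx < self.capacity:
                self.data[idx] = (example, label)

    def sample(self, batch_size):
        # Draw a replay mini-batch to interleave with the current task's data.
        k = min(batch_size, len(self.data))
        return random.sample(self.data, k)
```

In a typical training loop, each incoming mini-batch is added to the buffer and a replayed mini-batch drawn via `sample()` is mixed into the loss for the current task; the quantity-versus-quality question the abstract raises then amounts to how the buffer's capacity trades off against which examples it retains.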