What Variables Affect Out-of-Distribution Generalization in Pretrained Models?
Format: Journal Article
Language: English
Published: 23.05.2024
Summary: Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary widely. We study the factors influencing transferability and out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which is closely related to intermediate neural collapse. This hypothesis suggests that deeper DNN layers compress representations and hinder OOD generalization. Contrary to earlier work, our experiments show this is not a universal phenomenon. We comprehensively investigate the impact of DNN architecture, training data, image resolution, and augmentations on transferability. We identify that training with high-resolution datasets containing many classes greatly reduces representation compression and improves transferability. Our results emphasize the danger of generalizing findings from toy datasets to broader contexts.
DOI: 10.48550/arxiv.2405.15018
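
The tunnel effect the summary describes is usually measured by freezing a pretrained backbone, tapping embeddings at several depths, and training a linear probe per layer on a downstream or OOD dataset: under the hypothesis, probe accuracy peaks at an intermediate layer and collapses in the deeper "tunnel" layers. Below is a minimal sketch of such per-layer probing. The backbone (a torchvision ResNet-18), the tapped layer names, the probe settings, and the use of CIFAR-10 as a stand-in OOD dataset are all illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch: per-layer linear probing of a frozen pretrained
# ResNet-18 to look for the tunnel effect. Layers, dataset, and probe
# hyperparameters are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision import models, datasets, transforms
from torchvision.models.feature_extraction import create_feature_extractor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pretrained backbone; tap the four residual stages.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone = backbone.eval().to(device)
extractor = create_feature_extractor(
    backbone, return_nodes={"layer1": "l1", "layer2": "l2",
                            "layer3": "l3", "layer4": "l4"})

# Downstream dataset (CIFAR-10 here as a stand-in OOD task).
tf = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train = datasets.CIFAR10("data", train=True, download=True, transform=tf)
loader = torch.utils.data.DataLoader(train, batch_size=64, shuffle=True)

# One linear probe per tapped layer, on globally average-pooled features.
with torch.no_grad():
    x0, _ = next(iter(loader))
    feats = extractor(x0.to(device))
probes, opts = {}, {}
for name, f in feats.items():
    dim = f.mean(dim=(2, 3)).shape[1]   # channel dim after global pooling
    probes[name] = nn.Linear(dim, 10).to(device)
    opts[name] = torch.optim.Adam(probes[name].parameters(), lr=1e-3)

loss_fn = nn.CrossEntropyLoss()
for x, y in loader:                      # one short pass for illustration
    x, y = x.to(device), y.to(device)
    with torch.no_grad():                # backbone stays frozen
        feats = extractor(x)
    for name, f in feats.items():
        logits = probes[name](f.mean(dim=(2, 3)))
        loss = loss_fn(logits, y)
        opts[name].zero_grad()
        loss.backward()
        opts[name].step()
# Comparing held-out probe accuracy across l1..l4 would then reveal
# whether deeper layers compress representations and hurt transfer.
```

Evaluating each probe on a held-out split and plotting accuracy against layer depth is the usual diagnostic: a monotone rise suggests no tunnel, while a peak followed by decay in the final layers is the compression signature the paper's hypothesis refers to.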