On Measuring the Intrinsic Few-Shot Hardness of Datasets
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 16.11.2022 |
Summary: | While advances in pre-training have led to dramatic improvements in few-shot learning of NLP tasks, there is limited understanding of what drives successful few-shot adaptation in datasets. In particular, given a new dataset and a pre-trained model, what properties of the dataset make it \emph{few-shot learnable}, and are these properties independent of the specific adaptation techniques used? We consider an extensive set of recent few-shot learning methods and show that their performance across a large number of datasets is highly correlated, suggesting that few-shot hardness may be intrinsic to a dataset for a given pre-trained model. To estimate intrinsic few-shot hardness, we then propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible by exploiting feature-space invariances between training and test samples. Our metric accounts for few-shot hardness better than existing notions of hardness and is ~8-100x faster to compute. |
---|---|
DOI: | 10.48550/arxiv.2211.09113 |
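The abstract only gives the intuition behind "Spread" (feature-space invariances between training and test samples), not its exact definition. Below is a minimal illustrative sketch of one way such a spread-style score could be computed from pre-trained embeddings, using mean nearest-neighbour cosine distance from test points to training points; the function name, the distance choice, and the formula are assumptions for illustration and may differ from the metric defined in the paper.

```python
# Illustrative sketch only: a spread-style score over feature space, NOT the
# paper's exact "Spread" definition. Assumes embeddings come from a fixed
# pre-trained encoder.
import numpy as np

def spread_score(train_emb: np.ndarray, test_emb: np.ndarray) -> float:
    """Mean cosine distance from each test embedding to its nearest
    training embedding. Higher values mean test points sit farther from
    the few-shot training set in feature space (fewer exploitable
    invariances), suggesting a harder dataset."""
    # L2-normalise rows so dot products are cosine similarities.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = test @ train.T                     # (n_test, n_train) cosine similarities
    nearest_sim = sims.max(axis=1)            # best-matching training sample per test point
    return float(np.mean(1.0 - nearest_sim))  # average nearest-neighbour cosine distance

# Toy usage with random vectors standing in for encoder features.
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(32, 768))   # e.g. 32 few-shot training examples
test_emb = rng.normal(size=(200, 768))   # e.g. 200 test examples
print(f"spread score: {spread_score(train_emb, test_emb):.3f}")
```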