A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
| Field | Value |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | 01.10.2019 |
Summary:
Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, reconstruction error). We present the Visual Task Adaptation Benchmark (VTAB), which defines good representations as those that adapt to diverse, unseen tasks with few examples. With VTAB, we conduct a large-scale study of many popular publicly-available representation learning algorithms. We carefully control confounders such as architecture and tuning budget. We address questions like: How effective are ImageNet representations beyond standard natural datasets? How do representations trained via generative and discriminative models compare? To what extent can self-supervision replace labels? And, how close are we to general visual representations?
DOI: 10.48550/arxiv.1910.04867