Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories

Bibliographic Details
Published in: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10685-10694
Main Authors: Poms, Fait; Sarukkai, Vishnu; Mullapudi, Ravi Teja; Sohoni, Nimit S.; Mark, William R.; Ramanan, Deva; Fatahalian, Kayvon
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2021
Summary: For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs. We propose a statistical validation algorithm that accurately estimates the F-score of binary classifiers for rare categories, where finding relevant examples to evaluate on is particularly challenging. Our key insight is that simultaneous calibration and importance sampling enables accurate estimates even in the low-sample regime (< 300 samples). Critically, we also derive an accurate single-trial estimator of the variance of our method and demonstrate that this estimator is empirically accurate at low sample counts, enabling a practitioner to know how well they can trust a given low-sample estimate. When validating state-of-the-art semi-supervised models on ImageNet and iNaturalist2017, our method achieves the same estimates of model performance with up to 10× fewer labels than competing approaches. In particular, we can estimate model F1 scores with a variance of 0.005 using as few as 100 labels.
ISSN: 2380-7504
DOI: 10.1109/ICCV48922.2021.01053
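
Illustrative sketch: The summary describes estimating F1 for a rare category by labeling only a small, importance-sampled subset of the pool. The following is a minimal sketch of a generic importance-sampling ratio estimator for F1, not the authors' algorithm: it omits the calibration step and the single-trial variance estimator that the summary mentions. The function names `make_proposal` and `importance_sampled_f1`, the `eps` mixing parameter, and the synthetic demo data are all assumptions made for illustration.

```python
import random


def make_proposal(scores, eps=0.1):
    """Hypothetical proposal: mix score-proportional and uniform sampling
    so that every item keeps a nonzero sampling probability q_i."""
    n = len(scores)
    total = sum(scores)
    if total <= 0:
        return [1.0 / n] * n
    return [(1.0 - eps) * s / total + eps / n for s in scores]


def importance_sampled_f1(preds, sampled_idx, sampled_labels, q):
    """Estimate F1 = 2*TP / (#predicted positive + #actual positive).

    preds          : 0/1 classifier predictions for the whole pool (known)
    sampled_idx    : indices drawn with replacement from proposal q
    sampled_labels : 0/1 ground-truth labels for those draws
    q              : proposal probabilities over the pool (sum to 1)
    """
    n = len(sampled_idx)
    tp_hat = 0.0   # estimate of sum_i y_i * pred_i
    pos_hat = 0.0  # estimate of sum_i y_i
    for i, y in zip(sampled_idx, sampled_labels):
        w = 1.0 / (n * q[i])  # importance weight for one draw
        tp_hat += w * y * preds[i]
        pos_hat += w * y
    denom = sum(preds) + pos_hat  # predicted-positive count is exact
    return 2.0 * tp_hat / denom if denom > 0 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic pool (assumption): roughly 2% positives, noisy scores.
    labels = [1 if rng.random() < 0.02 else 0 for _ in range(5000)]
    scores = [min(1.0, max(0.0, 0.7 * y + rng.gauss(0.15, 0.15))) for y in labels]
    preds = [1 if s > 0.5 else 0 for s in scores]

    q = make_proposal(scores)
    idx = rng.choices(range(len(labels)), weights=q, k=100)
    estimate = importance_sampled_f1(preds, idx, [labels[i] for i in idx], q)
    print("estimated F1 from 100 labels:", round(estimate, 3))
```

Sampling in proportion to the classifier score concentrates the labeling budget on items likely to be positive, which is where a rare category's true positives and false positives live; the uniform component keeps low-scoring false negatives reachable. A ratio estimator of this kind is consistent but not exactly unbiased at small sample sizes.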