Cluster Centers Provide Good First Labels for Object Detection
Published in | Image Analysis and Processing – ICIAP 2022, pp. 404–413
---|---|
Main Authors | , , ,
Format | Book Chapter
Language | English
Published | Cham: Springer International Publishing
Series | Lecture Notes in Computer Science
Summary: | Learning object detection models with only a few labels is possible thanks to ingenious few-shot techniques and to clever selection of the images to be labeled. Few-shot techniques work with as few as 1 to 10 randomly selected labels per object class. We ask whether the performance of randomized label selection can be improved by selecting 1 to 10 labels per object class in a non-random manner. Several active learning techniques have been proposed for selecting object labels, but all start from a minimum of several tens of labels. We explore an effective and simple label selection strategy for the case of 1 to 10 labels per object class. First, the full unlabeled dataset is clustered into N clusters, where N is the desired number of labels. Clustering uses k-means on embedding vectors from a state-of-the-art pretrained image classification model (SimCLR v2). The image closest to each cluster center is selected to be labeled. The strategy is effective: on Pascal VOC we validate that it improves over randomized selection by more than 25%, with especially large improvements when having only 1 label per object class. This simple strategy has several benefits: it is easy to implement, it is effective, and it is relevant in practice, where one often starts with a dataset without any labels. |
---|---|
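The selection strategy described in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the SimCLR v2 embeddings have already been extracted into an array (feature extraction is not shown), and it uses a plain NumPy k-means with a deterministic farthest-point initialization for reproducibility.

```python
import numpy as np

def select_images_to_label(embeddings, n_labels, n_iter=50):
    """Return indices of the images closest to k-means cluster centers.

    ``embeddings``: (n_images, d) array of feature vectors, e.g. SimCLR v2
    embeddings computed beforehand (assumed, not shown here).
    ``n_labels``: N, the desired number of labels / clusters.
    """
    X = np.asarray(embeddings, dtype=float)

    # Deterministic farthest-point initialization: start from the first
    # image, then repeatedly add the image farthest from all chosen centers.
    centers = [X[0]]
    for _ in range(n_labels - 1):
        d2 = ((X[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1)
        centers.append(X[int(np.argmax(d2.min(axis=1)))])
    centers = np.array(centers)

    # Standard k-means (Lloyd's algorithm) on the embedding vectors.
    for _ in range(n_iter):
        # Assign every image to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        # Move each center to the mean of its assigned images.
        for k in range(n_labels):
            members = X[assign == k]
            if len(members):
                centers[k] = members.mean(axis=0)

    # For each cluster, pick the image nearest to its final center:
    # these are the N images to send out for labeling.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return [int(d2[:, k].argmin()) for k in range(n_labels)]
```

In practice one would set `n_labels` to the number of object classes times the labels desired per class, then label the returned images; the intuition from the summary is that cluster centers cover the data distribution far better than a random draw when only 1 to 10 labels per class are available.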
ISBN: | 3031064267 9783031064265 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-031-06427-2_34 |