Few-shot object detection with semantic enhancement and semantic prototype contrastive learning

Bibliographic Details
Published in	Knowledge-based systems Vol. 252; p. 109411
Main Authors	Huang, Lian; Dai, Shaosheng; He, Ziqiang
Format	Journal Article
Language	English
Published	Elsevier B.V., 27.09.2022
Subjects	Cross-attention; Few-shot learning; Object detection; Supervised contrastive learning; Word embedding
Summary:	Few-shot object detection (FSOD), which aims to teach machines to detect objects of novel classes from extremely few annotated examples, has attracted extensive research interest. However, FSOD performance is still limited by the lack of data. Under the few-shot setting, the visual information of novel objects has significant intraclass variance, so visual information alone cannot accurately represent the objects. In contrast, humans are good at using their visual and semantic systems together to recognize new concepts. In this paper, we explore using additional semantic knowledge to assist the FSOD task. Concretely, we first obtain the semantic representation of each class from a word embedding model learned from a large text corpus. We then design a semantic enhancement (SE) module to enhance the incomplete visual representation of novel classes. To further improve classification performance, we define a semantic prototype contrastive (SPC) loss to learn a more discriminative embedding space, in which features of the same class are compactly clustered around the corresponding semantic representation. Furthermore, we introduce a semantic margin between different semantic representations into the SPC loss to adaptively separate features belonging to different classes. Extensive experiments on the PASCAL VOC and MS-COCO benchmarks demonstrate that the proposed method achieves state-of-the-art performance.
• Semantic knowledge strengthens the expression of visual information.
• The semantic prototype contrastive loss enables the classifier to learn a more discriminative embedding space.
• The semantic margin facilitates the separation of features belonging to similar classes.
• The proposed method achieves state-of-the-art few-shot object detection performance.
ISSN:	0950-7051, 1872-7409
DOI:	10.1016/j.knosys.2022.109411
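
The Summary describes the SPC loss only in words: features of a class are pulled toward that class's semantic representation, and a semantic margin helps separate different classes. The following is a minimal sketch of one way such a loss could look, not the formulation used in the paper, which this record does not give. It assumes the semantic prototypes have already been projected to the same dimension as the visual features, uses cosine similarity with a softmax over classes, and adds to each negative class a margin proportional to its semantic similarity to the true class. The function names, the `temperature` and `margin_scale` parameters, and the margin formula are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def spc_loss_sketch(features, labels, prototypes,
                    temperature=0.1, margin_scale=0.5):
    """Illustrative prototype-contrastive loss with a semantic margin.

    features   : list of feature vectors (one per detected region)
    labels     : list of integer class indices, one per feature
    prototypes : list of semantic prototype vectors, one per class
                 (assumed already projected to the feature dimension)
    """
    total = 0.0
    for feat, y in zip(features, labels):
        logits = []
        for k, proto in enumerate(prototypes):
            sim = cosine(feat, proto)
            if k != y:
                # Assumed margin: semantically closer negative classes
                # receive a larger penalty, so they are pushed further away.
                sim += margin_scale * max(0.0, cosine(prototypes[y], proto))
            logits.append(sim / temperature)
        # Numerically stable log-sum-exp over all classes.
        top = max(logits)
        log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
        # Cross-entropy of the true class against all prototypes.
        total += log_sum - logits[y]
    return total / len(features)


if __name__ == "__main__":
    protos = [[1.0, 0.0], [0.0, 1.0]]
    feats = [[0.9, 0.1], [0.2, 0.8]]
    print(spc_loss_sketch(feats, [0, 1], protos))
```

In this sketch the loss is small when each feature lies close to its own class prototype and grows when it lies closer to another prototype. Setting `margin_scale` to 0 reduces it to a plain softmax cross-entropy over cosine similarities.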