Representative Feature Alignment for Adaptive Object Detection

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, no. 2, pp. 689-700
Main Authors: Xu, Shan; Zhang, Huaidong; Xu, Xuemiao; Hu, Xiaowei; Xu, Yangyang; Dai, Liangui; Choi, Kup-Sze; Heng, Pheng-Ann
Format: Journal Article
Language: English
Published: New York, IEEE, 01.02.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Unsupervised domain adaptation for object detection aims to generalize an object detector trained on a label-rich source domain to an unlabeled target domain. Recent works adopt instance-level or pixel-level alignment to perform domain transfer, which can effectively avoid the negative transfer caused by diverse backgrounds between domains. However, we find that they treat all regions of an instance feature equally, without suppressing the background area. They do not segment the specific texture and discriminative regions of objects, which are transferable during adaptation. We call features that combine local structural features and semantic discriminative features representative features. We propose a novel Representative Feature Alignment (RFA) model that aligns the features extracted from representative patterns of objects, i.e., representative features, for domain adaptation. Specifically, the representative features are extracted by Representative Feature Extraction (RFE) submodules. The RFE submodules take the features extracted from different intermediate layers of the detector as input and filter out the representative features layer by layer by integrating a class weighting generator, category selection, and class activation mapping. The representative features from multiple layers are then adaptively aggregated to obtain the final representative features, which are used to conduct feature alignment in a class-aware manner. Our representative features are free of untransferable regions and background areas, which leads to better feature alignment. Extensive experimental results show that the proposed model outperforms state-of-the-art methods on a few benchmark datasets.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2022.3202094
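
The Summary mentions filtering features through category selection and class activation mapping. As a rough illustration of that general idea only, the sketch below computes a class activation map for the highest-scoring class and uses it to suppress low-activation (background-like) positions in a feature map. This record does not describe the paper's class weighting generator, multi-layer aggregation, or alignment losses, so every function name, the min-max normalization, and the toy data here are assumptions, not the authors' RFE implementation.

```python
# Illustrative sketch only: names and normalization are assumptions,
# not the RFE submodule described in the paper.

def class_activation_map(features, class_weights):
    """Weighted sum over channels. features is C x H x W (nested lists)."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [[sum(class_weights[c] * features[c][y][x] for c in range(channels))
             for x in range(width)]
            for y in range(height)]

def normalize(cam):
    """Min-max scale a 2D map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]

def masked_features(features, weights_per_class, class_scores):
    """Select the top-scoring class, then mask features with its CAM."""
    # Category selection (assumed here to be a simple argmax over scores).
    selected = max(range(len(class_scores)), key=lambda k: class_scores[k])
    mask = normalize(class_activation_map(features, weights_per_class[selected]))
    # Down-weight positions where the selected class has low activation.
    masked = [[[channel[y][x] * mask[y][x] for x in range(len(mask[0]))]
               for y in range(len(mask))]
              for channel in features]
    return selected, mask, masked

# Toy input: 2 channels, 2 x 2 spatial grid, 2 classes.
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # channel 0 fires top-left
    [[0.0, 0.0], [0.0, 2.0]],  # channel 1 fires bottom-right
]
weights_per_class = [[1.0, 0.0], [0.0, 1.0]]  # class k attends to channel k
class_scores = [0.2, 0.8]

selected, mask, masked = masked_features(features, weights_per_class, class_scores)
print(selected)  # index of the selected class
print(mask)      # normalized activation map used as a spatial mask
```

In this toy case the second class is selected, so only the bottom-right position survives the masking and the activation in channel 0 is zeroed out.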