Representative Feature Alignment for Adaptive Object Detection
Published in | IEEE transactions on circuits and systems for video technology Vol. 33; no. 2; pp. 689 - 700 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.02.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | Unsupervised domain adaptation for object detection aims to generalize an object detector trained on a label-rich source domain to an unlabeled target domain. Recent works adopt instance-level or pixel-level alignment to perform domain transfer, which can effectively avoid the negative transfer caused by diverse backgrounds between domains. However, we find that they treat all regions of an instance feature equally, without suppressing the background area. They do not segment the specific texture and discriminative regions of objects, which are transferable during adaptation. We call features that combine local structural features and semantic discriminative features representative features. We propose a novel Representative Feature Alignment (RFA) model that aligns the features extracted from representative patterns of objects, i.e., representative features, for domain adaptation. Specifically, the representative features are extracted by Representative Feature Extraction (RFE) submodules. The RFE submodules take features extracted from different intermediate layers of the detector as input and extract the representative features layer by layer by integrating a class weighting generator, category selection, and class activation mapping. The representative features from multiple layers are then adaptively aggregated to obtain the final representative features, which are used to conduct feature alignment in a class-aware manner. Our representative features are free of untransferable regions and background areas, which leads to better feature alignment. Extensive experimental results show that the proposed model outperforms state-of-the-art methods on a few benchmark datasets. |
---|---|
ISSN: | 1051-8215 1558-2205 |
DOI: | 10.1109/TCSVT.2022.3202094 |
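
The summary names class activation mapping and category selection as ingredients of the RFE submodules but gives no formulas. As a rough illustration only, the following is a minimal sketch of generic class activation mapping used as a soft spatial mask on one feature map. It is not the authors' implementation: the function names, the "keep the highest-scoring class" selection rule, and the rescaling step are all assumptions made for this example.

```python
from typing import List, Tuple

Grid = List[List[float]]   # height x width
Feature = List[Grid]       # channels x height x width


def class_activation_map(features: Feature, class_weights: List[float]) -> Grid:
    """Generic CAM: cam[y][x] = sum over channels c of w[c] * features[c][y][x]."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, channel in zip(class_weights, features):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def rescale(cam: Grid) -> Grid:
    """Clamp negative activations to zero, then divide by the maximum."""
    peak = max(max(v, 0.0) for row in cam for v in row)
    if peak == 0.0:
        return [[0.0 for _ in row] for row in cam]
    return [[max(v, 0.0) / peak for v in row] for row in cam]


def masked_features(
    features: Feature,
    weights_per_class: List[List[float]],
    class_scores: List[float],
) -> Tuple[int, Grid, Feature]:
    """Pick one class, build its CAM, and down-weight low-activation regions.

    Assumption: category selection is modelled as "keep the top-scoring class".
    """
    selected = max(range(len(class_scores)), key=class_scores.__getitem__)
    mask = rescale(class_activation_map(features, weights_per_class[selected]))
    masked = [
        [[mask[y][x] * value for x, value in enumerate(row)]
         for y, row in enumerate(channel)]
        for channel in features
    ]
    return selected, mask, masked


# Toy input: 2 channels on a 2x2 grid, 2 hypothetical classes.
toy_features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_scores = [0.2, 0.8]
selected, mask, masked = masked_features(toy_features, toy_weights, toy_scores)
```

In this toy input the second class scores highest, so the mask keeps only the bottom-right location and the activation in the first channel is suppressed. A full model along the lines the summary describes would repeat such a step on several detector layers and aggregate the results before alignment, which this sketch does not attempt.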