Skeleton-CutMix: Mixing Up Skeleton with Probabilistic Bone Exchange for Supervised Domain Adaptation


Bibliographic Details
Published in IEEE Transactions on Image Processing Vol. 32; p. 1
Main Authors Liu, Hanchao; Liu, Yuhe; Mu, Tai-Jiang; Huang, Xiaolei; Hu, Shi-Min
Format Journal Article
Language English
Published United States IEEE 01.01.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: We present Skeleton-CutMix, a simple and effective skeleton augmentation framework for supervised domain adaptation and show its advantage in skeleton-based action recognition tasks. Existing approaches usually perform domain adaptation for action recognition with elaborate loss functions that aim to achieve domain alignment. However, they fail to capture the intrinsic characteristics of skeleton representation. Benefiting from the well-defined correspondence between bones of a pair of skeletons, we instead mitigate domain shift by fabricating skeleton data in a mixed domain, which mixes up bones from the source domain and the target domain. The fabricated skeletons in the mixed domain can be used to augment training data and train a more general and robust model for action recognition. Specifically, we hallucinate new skeletons by using pairs of skeletons from the source and target domains; a new skeleton is generated by exchanging some bones from the skeleton in the source domain with corresponding bones from the skeleton in the target domain, which resembles a cut-and-mix operation. When exchanging bones from different domains, we introduce a class-specific bone sampling strategy so that bones that are more important for an action class are exchanged with higher probability when generating augmentation samples for that class. We show experimentally that the simple bone exchange strategy for augmentation is efficient and effective and that distinctive motion features are preserved while mixing both action and style across domains. We validate our method in cross-dataset and cross-age settings on NTU-60 and ETRI-Activity3D datasets with an average gain of over 3% in terms of action recognition accuracy, and demonstrate its superior performance over previous domain adaptation approaches as well as other skeleton augmentation strategies.
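
The bone exchange described in the summary can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-frame skeleton stored as a list of 3D joint positions, a hypothetical 5-joint parent table (`PARENTS`), bones represented as child-minus-parent vectors, and a hypothetical per-class table `bone_probs` giving each bone's exchange probability. How the paper actually derives those class-specific probabilities, and how it handles full motion sequences, is not stated in the summary.

```python
import random

# Hypothetical toy skeleton: PARENTS[j] is the parent of joint j; joint 0 is the root.
PARENTS = [-1, 0, 1, 1, 1]

def to_bones(joints, parents):
    """Return {child_joint: bone vector}, where a bone is child minus parent position."""
    bones = {}
    for j, p in enumerate(parents):
        if p >= 0:
            bones[j] = tuple(c - q for c, q in zip(joints[j], joints[p]))
    return bones

def from_bones(root, bones, parents):
    """Rebuild joint positions by accumulating bone vectors outward from the root.

    Assumes every parent has a smaller index than its children.
    """
    joints = [None] * len(parents)
    joints[0] = tuple(root)
    for j, p in enumerate(parents):
        if p >= 0:
            joints[j] = tuple(a + b for a, b in zip(joints[p], bones[j]))
    return joints

def skeleton_cutmix(source, target, bone_probs, parents=PARENTS, rng=random):
    """Mix two skeletons by swapping source bones for the corresponding target bones.

    bone_probs[j] is the (assumed, class-specific) probability that the bone
    ending at joint j is taken from the target skeleton.
    Returns the mixed joint list and the list of swapped bone indices.
    """
    src_bones = to_bones(source, parents)
    tgt_bones = to_bones(target, parents)
    mixed, swapped = {}, []
    for j, bone in src_bones.items():
        if rng.random() < bone_probs[j]:
            mixed[j] = tgt_bones[j]   # take this bone from the target domain
            swapped.append(j)
        else:
            mixed[j] = bone           # keep the source-domain bone
    # The source root position is kept so the mixed skeleton stays anchored.
    return from_bones(source[0], mixed, parents), swapped
```

In a training loop, `source` and `target` would presumably be paired samples from the two domains, and `bone_probs` would be looked up by action label so that bones deemed more important for that class are exchanged more often. Working on bone vectors rather than raw joint coordinates is a design choice of this sketch: it keeps the rebuilt skeleton connected after an exchange.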
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2023.3293766