A Transformer-based Unsupervised Domain Adaptation Method for Skeleton Behavior Recognition
Published in | IEEE Access, Vol. 11, p. 1 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023 |
Subjects | |
Summary: | In recent years, skeleton-based action recognition has received extensive attention, and many studies have achieved excellent performance. In this article, we investigate an unsupervised domain adaptation (UDA) method for skeleton-based action recognition, a task that is challenging in real-world scenes. In domain adaptation, labels are available only on the source domain, not on the target domain. Unlike traditional UDA approaches such as adversarial learning-based methods, we adopt a transformer mechanism based on cross-attention to align the domains. It learns from both the source and target domains to reduce the domain shift between different skeleton datasets, thereby reducing the effect of pseudo-label errors generated during the adaptation process. Taking the particularity of skeleton data into account, we explore feature representations in both the spatial and temporal dimensions. We focus on the adjacency dependency of skeleton joints; that is, each node is represented as a weighted sum of its adjacent joints. This enables the network to attend to the global characteristics of the skeleton data while also considering the local characteristics of joint connections. Sequences are divided into several parts, called subs, to reduce the time cost of the model. We conduct experiments on five datasets for skeleton-based action recognition, including two large-scale datasets (NTU RGB+D, NW-UCLA). Extensive results demonstrate that our method outperforms other approaches in some aspects. |
---|---|
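The cross-attention alignment described in the abstract can be sketched roughly as follows: features from one domain act as queries while features from the other supply keys and values, so the aligned output mixes target-domain statistics into source representations. This is a minimal NumPy illustration under assumed names and shapes, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    # queries come from one domain; keys/values from the other.
    # Each query row becomes a convex combination of the other
    # domain's feature rows, which is the alignment effect.
    scores = queries @ keys_values.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values

# toy clip-level embeddings: 4 source clips, 5 target clips, 8-dim each
rng = np.random.default_rng(0)
src = rng.standard_normal((4, 8))
tgt = rng.standard_normal((5, 8))

aligned_src = cross_attention(src, tgt, d_k=8)  # shape (4, 8)
```

In the full model this block would sit inside a transformer layer with learned query/key/value projections and be trained end to end; the sketch only shows the attention arithmetic that couples the two domains.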
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2023.3274658 |
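The abstract's spatial modeling, where each node is a weighted sum of its adjacent joints, resembles aggregation over a row-normalized skeleton adjacency matrix. The sketch below is a hypothetical toy example (a 5-joint chain skeleton with made-up coordinates), not the paper's graph definition.

```python
import numpy as np

# toy 5-joint chain skeleton: joints connected as 0-1-2-3-4
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
num_joints = 5

# adjacency with self-loops, row-normalized so each output joint
# is a weighted sum of itself and its neighbors
A = np.eye(num_joints)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A /= A.sum(axis=1, keepdims=True)

# fake (x, y, z) coordinates for each joint
joints = np.arange(num_joints * 3, dtype=float).reshape(num_joints, 3)

aggregated = A @ joints  # each row mixes its local neighborhood
```

Stacking such neighborhood mixing with learned feature transforms is what lets a model combine local joint-connection structure with the global attention over the whole skeleton sequence.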