A Transformer-based Unsupervised Domain Adaptation Method for Skeleton Behavior Recognition

Bibliographic Details
Published in: IEEE Access, Vol. 11, p. 1
Main Authors: Yan, QiuYan; Hu, Yan
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023

More Information
Summary: In recent years, skeleton-based action recognition has received extensive attention, and many studies have achieved excellent performance. In this article, we investigate an unsupervised domain adaptation (UDA) method for skeleton-based action recognition, which is challenging in real-world scenes. In domain adaptation tasks, labels are available only in the source domain, not in the target domain. Unlike traditional UDA approaches such as adversarial learning-based methods, we adopt a transformer mechanism based on cross-attention to align the domains. It learns from both the source and target domains to reduce the domain shift between different skeleton datasets, thereby reducing the effect of pseudo-label errors generated during the adaptation process. Taking the particularities of skeleton data into account, we explore feature representations in both the spatial and temporal dimensions. We focus on the adjacency dependency of skeleton joints: each node is a weighted sum of its adjacent joints. This enables the network to attend to the global characteristics of skeleton data while considering the local characteristics of joint connections. Sequences are divided into several parts, called subs, to reduce the model's time cost. We conduct experiments on five skeleton-based action recognition datasets, including two large-scale datasets (NTU RGB+D, NW-UCLA). Extensive results demonstrate that our method outperforms other approaches in some respects.
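The cross-attention alignment described in the summary can be sketched roughly as follows: source-domain frame features act as queries and attend over target-domain features, producing target-conditioned representations that can be used to reduce domain shift. This is a minimal single-head NumPy illustration, not the authors' implementation; all names, shapes, and the projection matrices are assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(src, tgt, w_q, w_k, w_v):
    """Single-head cross-attention sketch: source frames (queries)
    attend over target frames (keys/values), yielding
    target-conditioned source features for domain alignment."""
    q = src @ w_q                          # (T_src, d)
    k = tgt @ w_k                          # (T_tgt, d)
    v = tgt @ w_v                          # (T_tgt, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product, (T_src, T_tgt)
    return softmax(scores, axis=-1) @ v    # (T_src, d)

# toy example: 4 source frames, 6 target frames, feature dim 8
rng = np.random.default_rng(0)
d = 8
src = rng.standard_normal((4, d))   # hypothetical pooled source skeleton features
tgt = rng.standard_normal((6, d))   # hypothetical pooled target skeleton features
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
aligned = cross_attention(src, tgt, w_q, w_k, w_v)
print(aligned.shape)  # (4, 8)
```

In the same spirit, the per-joint "weighted sum of adjacent joints" the summary mentions corresponds to multiplying joint features by a normalized skeleton adjacency matrix, and the "subs" trick amounts to splitting the frame axis into chunks before attention so the score matrix stays small.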
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3274658