An efficient self-attention network for skeleton-based action recognition

Bibliographic Details
Published in	Scientific Reports, Vol. 12, no. 1, pp. 4111-10
Main Authors	Qin, Xiaofei; Cai, Rui; Yu, Jiabin; He, Changxiang; Zhang, Xuedian
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 08.03.2022; Nature Publishing Group; Nature Portfolio
Subjects	639/705/117; 639/705/258; Accuracy; Algorithms; Artificial intelligence; Convolution; Design; Humanities and Social Sciences; Humans; Laboratories; Machine learning; Methods; multidisciplinary; Neural networks; Neural Networks, Computer; Recognition, Psychology; Researchers; Science; Science (multidisciplinary); Skeleton

More Information
Summary:	There has been significant progress in skeleton-based action recognition. The human skeleton can be naturally represented as a graph, so graph convolutional networks have become the most popular method for this task. Most state-of-the-art methods optimize the structure of the human skeleton graph to obtain better performance. Building on these advanced algorithms, a simple but strong network is proposed with three major contributions. First, inspired by adaptive graph convolutional networks and non-local blocks, several self-attention modules are designed to exploit spatial and temporal dependencies and to dynamically optimize the graph structure. Second, a lightweight yet efficient network architecture is designed for skeleton-based action recognition. Third, a technique is proposed that enriches the skeleton data with bone connection information and noticeably improves performance. The method achieves 90.5% accuracy on the cross-subject setting of NTU60, with 0.89M parameters and 0.32 GMACs of computation cost. This work is expected to inspire new ideas for the field.
ISSN:	2045-2322
DOI:	10.1038/s41598-022-08157-5
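
Illustrative sketch:	The summary mentions two ideas that are easier to picture in code: enriching joint coordinates with bone information, and using self-attention to derive a data-dependent joint-to-joint graph. The NumPy sketch below is an illustration only and is not the authors' implementation. The 5-joint toy skeleton, the child-minus-parent bone definition, the function names, and the random projection matrices are all assumptions made for this example; the record does not state how the paper defines any of them.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton given as (child, parent) index pairs.
# NTU60 skeletons have 25 joints; this small layout is an assumption.
TOY_BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def bone_features(joints, bones=TOY_BONES):
    """Return per-joint bone vectors (child minus parent); the root stays zero.

    joints: array of shape (T, V, C) = (frames, joints, coordinates).
    """
    out = np.zeros_like(joints)
    for child, parent in bones:
        out[:, child] = joints[:, child] - joints[:, parent]
    return out


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_self_attention(x, w_q, w_k):
    """Scaled dot-product attention across joints, computed per frame.

    x: (T, V, C) features; w_q, w_k: (C, D) projections (assumed, untrained).
    Returns the (T, V, V) attention map, which acts like a dynamic
    adjacency matrix, and the aggregated features of shape (T, V, C).
    """
    q = x @ w_q  # (T, V, D)
    k = x @ w_k  # (T, V, D)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(w_q.shape[1])  # (T, V, V)
    attn = softmax(scores, axis=-1)
    return attn, attn @ x


rng = np.random.default_rng(0)
joints = rng.normal(size=(8, 5, 3))        # 8 frames, 5 joints, xyz
bones = bone_features(joints)
enriched = np.concatenate([joints, bones], axis=-1)  # (8, 5, 6)

w_q = rng.normal(size=(6, 4))
w_k = rng.normal(size=(6, 4))
attn, mixed = spatial_self_attention(enriched, w_q, w_k)
```

Each row of `attn` sums to 1, so it can be read as a soft, input-dependent set of edge weights from one joint to all others, in contrast to a fixed skeleton adjacency matrix. A temporal variant would apply the same operation across frames instead of joints.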