Learning Action Images Using Deep Convolutional Neural Networks For 3D Action Recognition
Published in | 2019 IEEE Sensors Applications Symposium (SAS) pp. 1 - 6 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.03.2019 |
Summary: | 3D action recognition has recently received growing attention from the research and industrial communities, thanks to the popularity of depth sensors and the efficiency of skeleton estimation algorithms. Accordingly, a large number of methods have been studied, using either handcrafted features with traditional classifiers or recurrent neural networks. However, these methods cannot exhaustively learn the high-level spatial and temporal features of a whole skeleton sequence. In this paper, we propose a novel encoding technique that transforms the pose features of joint-joint distance and joint-joint orientation into color pixels. By concatenating the features of all frames in a sequence, the spatial joint correlations and temporal pose dynamics of action appearance are depicted by a color image. For learning action models, we adopt the strategy of end-to-end fine-tuning of a pre-trained deep convolutional neural network to capture multiple high-level features at multiple scales of the action representation. The proposed method achieves state-of-the-art performance on NTU RGB+D, the largest and most challenging 3D action recognition dataset, for both the cross-subject and cross-view evaluation protocols. |
---|---|
DOI: | 10.1109/SAS.2019.8705977 |
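
The encoding step described in the summary can be illustrated with a minimal sketch. The summary does not give the paper's actual color mapping, normalization, or image layout, so everything specific here is an assumption made for illustration: the function names (`orientation_pixel`, `distance_pixel`, `encode_sequence`), the blue-to-red ramp for distances, the mapping of unit-vector components to RGB channels, and the choice to stack distance rows above orientation rows. Each frame becomes one image column, so the horizontal axis carries time and the vertical axis carries joint pairs.

```python
import math
from itertools import combinations


def orientation_pixel(a, b):
    """Map the unit vector from joint a to joint b onto an (R, G, B) triple."""
    d = [q - p for p, q in zip(a, b)]
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        return (128, 128, 128)  # coincident joints: neutral gray (assumption)
    # Each unit-vector component lies in [-1, 1]; rescale linearly to [0, 255].
    return tuple(int(round((c / norm + 1.0) * 127.5)) for c in d)


def distance_pixel(dist, max_dist):
    """Map a distance onto a blue-to-red ramp (hypothetical colormap)."""
    t = 0.0 if max_dist == 0.0 else min(dist / max_dist, 1.0)
    r = int(round(255 * t))
    return (r, 0, 255 - r)


def encode_sequence(frames):
    """Encode a skeleton sequence as rows of RGB pixels.

    frames: list of frames, each a list of at least two (x, y, z) joints.
    Returns a list of rows; row k holds feature k over time, and the
    distance features are stacked above the orientation features.
    """
    n_joints = len(frames[0])
    pairs = list(combinations(range(n_joints), 2))
    # Joint-joint Euclidean distances for every frame.
    dists = [[math.dist(f[i], f[j]) for i, j in pairs] for f in frames]
    # Normalize by the largest distance in the whole sequence (assumption).
    max_dist = max(max(col) for col in dists)
    columns = []
    for f, col in zip(frames, dists):
        dist_px = [distance_pixel(d, max_dist) for d in col]
        ori_px = [orientation_pixel(f[i], f[j]) for i, j in pairs]
        columns.append(dist_px + ori_px)
    # Transpose so rows index features and columns index frames.
    return [list(row) for row in zip(*columns)]
```

For a skeleton with J joints and a sequence of T frames, this sketch yields an image of height J(J-1) and width T. An image of this kind would typically be resized to the input resolution of the pre-trained network before fine-tuning; that step is not shown.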