Learning Action Images Using Deep Convolutional Neural Networks For 3D Action Recognition

Bibliographic Details
Published in	2019 IEEE Sensors Applications Symposium (SAS), pp. 1-6
Main Authors	Huynh-The, Thien; Hua, Cam-Hao; Kim, Dong-Seong
Format	Conference Proceeding
Language	English
Published	IEEE 01.03.2019
Subjects	Color; deep convolutional neural networks; Feature extraction; human action recognition; Image coding; Image color analysis; Image recognition; Pose feature to image encoding technique; Skeleton; Three-dimensional displays

Summary:	3D action recognition has recently attracted growing attention from the research and industrial communities, thanks to the popularity of depth sensors and the efficiency of skeleton estimation algorithms. Accordingly, many methods have been studied that use either handcrafted features with traditional classifiers or recurrent neural networks. However, these methods cannot exhaustively learn the high-level spatial and temporal features of a whole skeleton sequence. In this paper, we propose a novel encoding technique that transforms the pose features of joint-joint distance and joint-joint orientation into color pixels. By concatenating the features of all frames in a sequence, the spatial joint correlations and temporal pose dynamics of action appearance are depicted in a color image. To learn action models, we fine-tune a pre-trained deep convolutional neural network end to end to capture multiple high-level features from a multi-scale action representation. The proposed method achieves state-of-the-art performance on NTU RGB+D, the largest and most challenging 3D action recognition dataset, for both the cross-subject and cross-view evaluation protocols.
DOI:	10.1109/SAS.2019.8705977
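
The encoding idea in the summary (joint-joint distance and orientation features of every frame, concatenated over time into a color image) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their color mapping, so the mapping below is an assumption. Here the distance between each joint pair is min-max normalized over the sequence and written as a gray value, the orientation is taken as the unit vector between the two joints with its x, y, z components mapped from [-1, 1] to R, G, B, and the two blocks are stacked vertically. The function name `encode_action_image` and the `(T, J, 3)` input layout are hypothetical.

```python
import numpy as np

def encode_action_image(seq):
    """Encode a skeleton sequence as a color image (illustrative sketch).

    seq: array of shape (T, J, 3), T frames of J joints in 3D.
    Returns a uint8 image of shape (2 * P, T, 3), where P = J * (J - 1) / 2
    is the number of joint pairs. Each column is one frame; the top P rows
    hold distances, the bottom P rows hold orientations.
    """
    seq = np.asarray(seq, dtype=np.float64)
    T, J, _ = seq.shape

    # Indices of all unordered joint pairs (i < j).
    i, j = np.triu_indices(J, k=1)

    # Vector from joint i to joint j in every frame: shape (T, P, 3).
    diff = seq[:, j, :] - seq[:, i, :]

    # Joint-joint distance: shape (T, P).
    dist = np.linalg.norm(diff, axis=-1)

    # Joint-joint orientation as a unit vector (assumed representation);
    # the small floor avoids division by zero for coincident joints.
    unit = diff / np.maximum(dist, 1e-8)[..., None]

    # Assumed color mapping: distance -> gray level in [0, 1].
    span = dist.max() - dist.min()
    dist01 = (dist - dist.min()) / span if span > 0 else np.zeros_like(dist)
    dist_rgb = np.repeat(dist01[..., None], 3, axis=-1)

    # Assumed color mapping: unit-vector components [-1, 1] -> RGB in [0, 1].
    orient_rgb = (unit + 1.0) / 2.0

    # Stack both feature blocks, then put time on the horizontal axis.
    img = np.concatenate([dist_rgb, orient_rgb], axis=1)  # (T, 2P, 3)
    img = np.transpose(img, (1, 0, 2))                    # (2P, T, 3)
    return np.round(img * 255).astype(np.uint8)
```

For a 25-joint skeleton there are 300 joint pairs, so a 10-frame sequence yields a 600 x 10 image under this layout. Such an image would typically be resized to the input size of the pre-trained network before fine-tuning; that step is omitted here.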