Deep State-Space Model for Noise Tolerant Skeleton-Based Action Recognition


Bibliographic Details
Published in: IEICE Transactions on Information and Systems, Vol. E103.D, No. 6, pp. 1217-1225
Main Authors: KAWAMURA, Kazuki; MATSUBARA, Takashi; UEHARA, Kuniaki
Format: Journal Article
Language: English
Published: Tokyo: The Institute of Electronics, Information and Communication Engineers, 01.06.2020; Japan Science and Technology Agency
More Information
Summary: Action recognition using skeleton data (3D coordinates of human joints) is an attractive topic due to its robustness to the actor's appearance, the camera's viewpoint, illumination, and other environmental conditions. However, skeleton data must be measured by a depth sensor or extracted from video data using an estimation algorithm, both of which introduce extraction errors and noise. In this work, for robust skeleton-based action recognition, we propose a deep state-space model (DSSM). The DSSM is a deep generative model of the underlying dynamics of an observable sequence. We applied the proposed DSSM to skeleton data, and the results demonstrate that it improves the classification performance of a baseline method. Moreover, we confirm that feature extraction with the proposed DSSM renders subsequent classification robust to noise and missing values. In these experimental settings, the proposed DSSM outperforms a state-of-the-art method.
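
The summary describes the DSSM only at a high level. As a rough illustration of what a deep generative state-space model over an observable sequence can look like, the sketch below implements a generic variational state-space model in PyTorch. The layer sizes, the Gaussian transition/emission/inference networks, the observation dimension (assumed here to be 25 joints x 3 coordinates), and the ELBO-style training loss are assumptions for illustration only, not the architecture reported in the paper.

```python
# Minimal sketch of a deep state-space model (DSSM) for sequence data.
# All design choices here (Gaussian latents, MLP networks, ELBO loss)
# are illustrative assumptions, not the authors' exact model.
import torch
import torch.nn as nn


class DeepStateSpaceModel(nn.Module):
    def __init__(self, obs_dim=75, latent_dim=32, hidden_dim=128):
        super().__init__()
        # Transition prior: p(z_t | z_{t-1}), parameterized by an MLP
        # that outputs mean and log-variance of a Gaussian.
        self.trans = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim))
        # Emission: p(x_t | z_t) reconstructs the skeleton frame.
        self.emit = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, obs_dim))
        # Inference: q(z_t | z_{t-1}, x_t) combines the previous latent
        # state with the current (possibly noisy) observation.
        self.infer = nn.Sequential(
            nn.Linear(latent_dim + obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim))

    @staticmethod
    def _gaussian(params):
        mean, log_var = params.chunk(2, dim=-1)
        return mean, log_var

    def forward(self, x):
        # x: (batch, time, obs_dim), e.g. flattened 3D joint coordinates.
        batch, T, _ = x.shape
        z = x.new_zeros(batch, self.trans[0].in_features)
        recon_loss, kl_loss = 0.0, 0.0
        for t in range(T):
            # Prior parameters from the transition model.
            p_mean, p_logvar = self._gaussian(self.trans(z))
            # Approximate posterior from the inference model.
            q_mean, q_logvar = self._gaussian(
                self.infer(torch.cat([z, x[:, t]], dim=-1)))
            # Reparameterized sample of the latent state.
            z = q_mean + torch.randn_like(q_mean) * (0.5 * q_logvar).exp()
            # Reconstruction term (squared error) of the ELBO.
            recon_loss = recon_loss + (
                (self.emit(z) - x[:, t]) ** 2).sum(-1).mean()
            # KL(q || p) between two diagonal Gaussians.
            kl_loss = kl_loss + 0.5 * (
                p_logvar - q_logvar
                + (q_logvar.exp() + (q_mean - p_mean) ** 2) / p_logvar.exp()
                - 1).sum(-1).mean()
        return recon_loss + kl_loss
```

In such a setup, a downstream classifier would typically be trained on the inferred latent states z rather than on the raw, possibly noisy joint coordinates; this is the general sense in which feature extraction with a state-space model can make classification more tolerant to noise and missing values, though the paper's specific pipeline may differ.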
ISSN: 0916-8532, 1745-1361
DOI: 10.1587/transinf.2019MVP0012