Human Action Recognition Using 3D Reconstruction Data

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 28, No. 8, pp. 1807-1823
Main Authors: Papadopoulos, Georgios Th.; Daras, Petros
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2018

Summary: In this paper, the problem of human action recognition using 3D reconstruction data is investigated in depth. 3D reconstruction techniques are employed to address two of the most challenging issues in human action recognition in the general case, namely view variance (i.e., when the same action is observed from different viewpoints) and the presence of (self-)occlusions (i.e., when, for a given point of view, a body part of an individual conceals another body part of the same or another subject). The main contributions of this paper are summarized as follows. The first is a detailed examination of the use of 3D reconstruction data for performing human action recognition, including the introduction of appropriate local/global flow/shape descriptors, extensive experiments on challenging publicly available datasets, and exhaustive comparisons with state-of-the-art approaches. The second is a new local-level 3D flow descriptor that incorporates spatial and surface information in the flow representation and efficiently handles the problem of defining 3D orientation in every local neighborhood. The third is a new global-level 3D flow descriptor that efficiently encodes the global motion characteristics in a compact way. The fourth is a novel global temporal-shape descriptor that extends the notion of 3D shape descriptions for action recognition by incorporating the temporal dimension. The proposed descriptor efficiently addresses the inherent problems of temporal alignment and compact representation, while also being robust in the presence of noise (compared with similar tracking-based methods in the literature). Overall, this paper significantly improves on the state-of-the-art performance and introduces new research directions in the field of 3D action recognition, following the recent development and widespread use of portable, affordable, high-quality, and accurate motion-capturing devices (e.g., Microsoft Kinect).
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2016.2643161