Interpretable 3D Human Action Analysis with Temporal Convolutional Networks

Bibliographic Details
Published in	2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1623-1631
Main Authors	Tae Soo Kim, Austin Reiter
Format	Conference Proceeding
Language	English
Published	IEEE 01.07.2017
Subjects	Activity recognition; Computational modeling; Feature extraction; Skeleton; Solid modeling; Three-dimensional displays

Summary:	The discriminative power of modern deep learning models for 3D human action recognition is growing ever so potent. In conjunction with the recent resurgence of 3D human action representation with 3D skeletons, the quality and the pace of recent progress have been significant. However, the inner workings of state-of-the-art learning based methods in 3D human action recognition still remain mostly black-box. In this work, we propose to use a new class of models known as Temporal Convolutional Neural Networks (TCN) for 3D human action recognition. TCN provides us a way to explicitly learn readily interpretable spatio-temporal representations for 3D human action recognition. Through this work, we wish to take a step towards a spatio-temporal model that is easier to understand, explain and interpret. The resulting model, Res-TCN, achieves state-of-the-art results on the largest 3D human action recognition dataset, NTU-RGBD.
ISSN:	2160-7516
DOI:	10.1109/CVPRW.2017.207