Convolutional Neural Networks and Long Short-Term Memory for skeleton-based human activity and hand gesture recognition

Bibliographic Details
Published in Pattern Recognition Vol. 76; pp. 80-94
Main Authors Núñez, Juan C., Cabido, Raúl, Pantrigo, Juan J., Montemayor, Antonio S., Vélez, José F.
Format Journal Article
Language English
Published Elsevier Ltd 01.04.2018
Summary:
• Combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) recurrent network for skeleton-based human activity and hand gesture recognition.
• Two-stage training strategy that first focuses on training the CNN and then adjusts the full CNN+LSTM method.
• A method for data augmentation in the context of spatiotemporal 3D data sequences.
• An exhaustive experimental study on publicly available data benchmarks against the most representative state-of-the-art methods.
• Comparison among different CPU and GPU platforms.

In this work, we address human activity and hand gesture recognition problems using 3D data sequences obtained from full-body and hand skeletons, respectively. To this end, we propose a deep learning-based approach for temporal 3D pose recognition problems based on a combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) recurrent network. We also present a two-stage training strategy that first focuses on CNN training and then adjusts the full method (CNN+LSTM). Experimental testing demonstrated that our training method obtains better results than a single-stage training strategy. Additionally, we propose a data augmentation method that has also been validated experimentally. Finally, we perform an extensive experimental study on publicly available data benchmarks. The results show that the proposed approach reaches state-of-the-art performance when compared to the methods identified in the literature. The best results were obtained for small datasets, where the proposed data augmentation strategy has greater impact.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2017.10.033
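
The summary mentions data augmentation for spatiotemporal 3D data sequences but does not say which transforms are used. The following is only a generic sketch of what augmenting a skeleton sequence might look like, assuming a sequence is a list of frames and each frame is a list of (x, y, z) joints. The function name `augment_sequence`, its parameters, and the choice of random scaling, translation, and Gaussian noise are all hypothetical illustrations, not the authors' method.

```python
import random
from typing import List, Tuple

Joint = Tuple[float, float, float]   # one 3D joint position (x, y, z)
Frame = List[Joint]                  # all joints of the skeleton at one time step
Sequence = List[Frame]               # frames ordered in time


def augment_sequence(
    seq: Sequence,
    rng: random.Random,
    max_scale: float = 0.1,    # assumed: scale factor drawn from [1 - 0.1, 1 + 0.1]
    max_shift: float = 0.05,   # assumed: per-axis translation bound
    noise_std: float = 0.01,   # assumed: std. dev. of per-coordinate jitter
) -> Sequence:
    """Return a randomly perturbed copy of a 3D skeleton sequence.

    One scale factor and one translation vector are drawn per sequence,
    so the motion stays coherent over time; independent Gaussian noise
    is then added to every coordinate.
    """
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    shift = tuple(rng.uniform(-max_shift, max_shift) for _ in range(3))

    augmented: Sequence = []
    for frame in seq:
        new_frame: Frame = []
        for joint in frame:
            new_joint = tuple(
                c * scale + s + rng.gauss(0.0, noise_std)
                for c, s in zip(joint, shift)
            )
            new_frame.append(new_joint)
        augmented.append(new_frame)
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy sequence: 3 frames, 2 joints each (values are made up).
    toy = [[(0.0, 0.0, 0.0), (0.1 * t, 0.5, 1.0)] for t in range(3)]
    for frame in augment_sequence(toy, rng):
        print(frame)
```

Drawing the scale and shift once per sequence, rather than once per frame, keeps the trajectory of each joint smooth, which would matter for a model that learns temporal patterns such as the LSTM described in the summary.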