Convolutional Neural Networks and Long Short-Term Memory for skeleton-based human activity and hand gesture recognition

Bibliographic Details
Published in	Pattern Recognition, Vol. 76, pp. 80-94
Main Authors	Núñez, Juan C.; Cabido, Raúl; Pantrigo, Juan J.; Montemayor, Antonio S.; Vélez, José F.
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.04.2018
Subjects	Convolutional Neural Network; Deep learning; Hand gesture recognition; Human activity recognition; Long Short-Term Memory; Real-time; Recurrent neural network
Summary:
•Combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) recurrent network for skeleton-based human activity and hand gesture recognition.
•Two-stage training strategy that first focuses on CNN training and then adjusts the full method (CNN+LSTM).
•A method for data augmentation in the context of spatiotemporal 3D data sequences.
•An exhaustive experimental study on publicly available data benchmarks, compared against the most representative state-of-the-art methods.
•Comparison among different CPU and GPU platforms.

In this work, we address human activity and hand gesture recognition problems using 3D data sequences obtained from full-body and hand skeletons, respectively. To this aim, we propose a deep learning-based approach for temporal 3D pose recognition problems based on a combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) recurrent network. We also present a two-stage training strategy that first focuses on CNN training and then adjusts the full method (CNN+LSTM). Experimental testing demonstrated that our training method obtains better results than a single-stage training strategy. Additionally, we propose a data augmentation method that has also been validated experimentally. Finally, we perform an extensive experimental study on publicly available data benchmarks. The results show that the proposed approach reaches state-of-the-art performance when compared to the methods identified in the literature. The best results were obtained for small datasets, where the proposed data augmentation strategy has greater impact.
ISSN:	0031-3203, 1873-5142
DOI:	10.1016/j.patcog.2017.10.033