Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks

Bibliographic Details
Published in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4207-4215
Main Authors Molchanov, Pavlo; Yang, Xiaodong; Gupta, Shalini; Kim, Kihwan; Tyree, Stephen; Kautz, Jan
Format Conference Proceeding
Language English
Published IEEE 01.06.2016
Summary: Automatic detection and classification of dynamic hand gestures in real-world systems intended for human-computer interaction is challenging because (1) there is a large diversity in how people perform gestures, making detection and classification difficult, and (2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification. In fact, a negative lag (classification before the gesture is finished) is desirable, as feedback to the user can then be truly instantaneous. In this paper, we address these challenges with a recurrent three-dimensional convolutional neural network that performs simultaneous detection and classification of dynamic hand gestures from multimodal data. We employ connectionist temporal classification to train the network to predict class labels from in-progress gestures in unsegmented input streams. In order to validate our method, we introduce a new, challenging multimodal dynamic hand gesture dataset captured with depth, color and stereo-IR sensors. On this dataset, our gesture recognition system achieves an accuracy of 83.8%, outperforms competing state-of-the-art algorithms, and approaches the human accuracy of 88.4%. Moreover, our method achieves state-of-the-art performance on the SKIG and ChaLearn2014 benchmarks.
ISSN:1063-6919
DOI:10.1109/CVPR.2016.456
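
Illustrative note: the summary says the network is trained with connectionist temporal classification (CTC) to label in-progress gestures in an unsegmented stream. The record does not describe how predictions are decoded, so the following is only a minimal sketch of standard greedy CTC decoding applied clip by clip, not the authors' implementation. The blank index, the function names, and the toy scores are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Sequence

BLANK = 0  # assumed index of the CTC blank ("no gesture") class


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def online_ctc_greedy(clip_scores: Iterable[Sequence[float]]) -> Iterator[int]:
    """Yield a gesture label the first time it wins a clip.

    Standard greedy CTC rule: collapse consecutive repeats, then drop
    blanks. Emitting at the first winning clip means a label can be
    reported while the gesture is still in progress.
    """
    previous = BLANK
    for scores in clip_scores:
        label = argmax(scores)
        if label != BLANK and label != previous:
            yield label
        previous = label


# Toy per-clip class scores (hypothetical): columns are blank, gesture 1, gesture 2.
stream = [
    [0.9, 0.1, 0.0],  # blank
    [0.2, 0.7, 0.1],  # gesture 1 starts, emitted here
    [0.1, 0.8, 0.1],  # gesture 1 continues, not emitted again
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.1, 0.8],  # gesture 2
    [0.2, 0.7, 0.1],  # gesture 1 again
]
decoded = list(online_ctc_greedy(stream))
```

Because the generator consumes one clip at a time, it can run on a live stream without waiting for a gesture boundary, which is the property the summary refers to as negative lag.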