A Light Implementation of a 3D Convolutional Network for Online Gesture Recognition


Bibliographic Details
Published in: Revista IEEE América Latina Vol. 18; no. 2; pp. 319-326
Main Authors: Brandolt Baldissera, Fabio; Vargas, Fabian Luis
Format: Journal Article
Language: English
Published: Los Alamitos: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2019

More Information
Summary: With the advancement of machine learning techniques and the increased accessibility of computing power, Artificial Neural Networks (ANNs) have achieved state-of-the-art results in image classification and, more recently, in video classification. Gesture recognition from a video source enables more natural, non-contact human-machine interaction and greater immersion in virtual reality environments, and may even lead to sign language translation in the near future. However, the techniques used in video classification are usually computationally expensive, making them prohibitive for conventional hardware. This work studies and analyzes the applicability of continuous online gesture recognition techniques to embedded systems. This goal is achieved by proposing a new model based on 2D and 3D CNNs that performs online gesture recognition, i.e., it yields a label while the video frames are still being processed, in a predictive manner, before having access to future frames of the video. This technique is of paramount interest for applications in which the video is acquired concurrently with the classification process and the labels must be issued within a strict deadline. The proposed model was tested against three representative gesture datasets from the literature. The obtained results suggest that the proposed technique improves on the state of the art by yielding a fast gesture recognition process with high accuracy, which is fundamental for applicability to embedded systems.
ISSN: 1548-0992
DOI: 10.1109/TLA.2019.9082244
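
The summary describes online recognition as emitting a label while frames are still arriving, without access to future frames. The paper's actual 2D/3D CNN architecture is not given in this record, so the sketch below is only a minimal illustration of that general idea, assuming PyTorch, a toy 3D CNN, a 16-frame sliding buffer, and 10 gesture classes; none of these choices come from the paper itself.

# Minimal sketch of online (streaming) gesture recognition with a small 3D CNN.
# NOT the authors' architecture: layer sizes, buffer length and class count
# are illustrative assumptions only.
import torch
import torch.nn as nn


class Small3DCNN(nn.Module):
    """Tiny 3D CNN mapping a clip of T frames to gesture-class logits."""

    def __init__(self, num_classes: int = 10, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d((1, 2, 2)),          # pool only spatially
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),          # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        x = self.features(clip).flatten(1)
        return self.classifier(x)


def online_recognition(frame_stream, model: Small3DCNN, buffer_len: int = 16):
    """Yield a predicted label for every incoming frame.

    Only frames seen so far are used, so a label is available while the
    video is still being acquired (no look-ahead into future frames).
    """
    buffer = []
    model.eval()
    with torch.no_grad():
        for frame in frame_stream:            # frame: (channels, height, width)
            buffer.append(frame)
            if len(buffer) > buffer_len:      # sliding window of recent frames
                buffer.pop(0)
            clip = torch.stack(buffer, dim=1).unsqueeze(0)  # (1, C, T, H, W)
            logits = model(clip)
            yield int(logits.argmax(dim=1).item())


if __name__ == "__main__":
    # Fake stream of 32 random RGB frames at 112x112, standing in for a camera.
    stream = (torch.rand(3, 112, 112) for _ in range(32))
    model = Small3DCNN(num_classes=10)
    for t, label in enumerate(online_recognition(stream, model)):
        print(f"frame {t}: predicted gesture class {label}")

A real deployment would replace the toy network with the authors' trained model and feed camera frames instead of random tensors; the key point illustrated is that each label is produced from the frames received so far, before any future frame exists.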