Large-scale gesture recognition with a fusion of RGB-D data based on the C3D model

Bibliographic Details
Published in: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 25 - 30
Main Authors: Yunan Li, Qiguang Miao, Kuan Tian, Yingying Fan, Xin Xu, Rui Li, Jianfeng Song
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2016

Summary: Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since factors such as background clutter can degrade accuracy. To achieve gesture recognition on large-scale video data, we propose a method based on RGB-D data. To better capture gesture details, the input videos are first resampled to 32 frames, and then the RGB and depth videos are each fed into the C3D model to extract spatiotemporal features. These features are then fused to boost performance; because the C3D features of both modalities share a uniform dimension, the fusion avoids producing unreasonable synthetic data. Our approach achieves 49.2% accuracy on the validation subset of the ChaLearn LAP IsoGD Database using only a linear SVM classifier. It also outperforms the baseline and the other methods in the challenge, winning first place with 56.9% accuracy on the test set.
DOI: 10.1109/ICPR.2016.7899602
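
The summary describes a three-stage pipeline: resampling each clip to 32 frames, extracting C3D spatiotemporal features from the RGB and depth streams separately, and fusing the two same-dimension feature vectors before a linear SVM. The sketch below illustrates that flow under stated assumptions; it uses PyTorch and scikit-learn, a small stand-in 3D CNN (TinyC3D) rather than the real C3D network, dummy clips instead of IsoGD data, and concatenation as the fusion rule, all of which are illustrative choices rather than the authors' released implementation.

# Minimal sketch of the RGB-D + C3D + linear SVM pipeline described in the
# summary. TinyC3D, the concatenation fusion, and all dimensions are
# illustrative assumptions, not the paper's exact model.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC


def sample_to_32_frames(video: np.ndarray) -> np.ndarray:
    """Uniformly resample a (T, H, W, C) clip to exactly 32 frames."""
    idx = np.linspace(0, len(video) - 1, num=32).round().astype(int)
    return video[idx]


class TinyC3D(nn.Module):
    """Stand-in for the C3D feature extractor: 3D convs + global pooling."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),        # -> (N, 64, 1, 1, 1)
        )
        self.fc = nn.Linear(64, feat_dim)   # fixed-length clip descriptor

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (N, 3, 32, H, W)
        x = self.features(clip).flatten(1)
        return self.fc(x)


@torch.no_grad()
def extract_fused_feature(rgb: np.ndarray, depth: np.ndarray,
                          model_rgb: TinyC3D, model_depth: TinyC3D) -> np.ndarray:
    """Extract per-modality features and fuse them.

    Both streams yield vectors of the same dimension, so they can be
    combined directly (here: concatenation) before the linear SVM.
    """
    def to_tensor(v: np.ndarray) -> torch.Tensor:
        v = sample_to_32_frames(v).astype(np.float32) / 255.0
        return torch.from_numpy(v).permute(3, 0, 1, 2).unsqueeze(0)  # (1, 3, 32, H, W)

    f_rgb = model_rgb(to_tensor(rgb)).squeeze(0).numpy()
    f_depth = model_depth(to_tensor(depth)).squeeze(0).numpy()
    return np.concatenate([f_rgb, f_depth])


if __name__ == "__main__":
    # Dummy clips standing in for IsoGD data: (T, H, W, 3) RGB and depth videos.
    rng = np.random.default_rng(0)
    model_rgb, model_depth = TinyC3D().eval(), TinyC3D().eval()
    feats, labels = [], []
    for label in (0, 1):
        for _ in range(4):
            rgb = rng.integers(0, 255, size=(40, 64, 64, 3), dtype=np.uint8)
            depth = rng.integers(0, 255, size=(40, 64, 64, 3), dtype=np.uint8)
            feats.append(extract_fused_feature(rgb, depth, model_rgb, model_depth))
            labels.append(label)
    clf = LinearSVC().fit(np.stack(feats), labels)  # linear SVM on fused features
    print("train accuracy:", clf.score(np.stack(feats), labels))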