Explore Efficient Local Features from RGB-D Data for One-Shot Learning Gesture Recognition

Bibliographic Details
Published in IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 38; no. 8; pp. 1626-1639
Main Authors Jun Wan, Guodong Guo, Stan Z. Li
Format Journal Article
Language English
Published United States IEEE 01.08.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: The availability of handy RGB-D sensors has brought about a surge of gesture recognition research and applications. Among the various approaches, one-shot learning is advantageous because it requires a minimal amount of data. Here, we provide a thorough review of one-shot learning gesture recognition from RGB-D data and propose a novel spatiotemporal feature extracted from RGB-D data, namely mixed features around sparse keypoints (MFSK). In the review, we analyze the challenges we face and point out some future research directions that may guide researchers in this field. The proposed MFSK feature is robust and invariant to scale, rotation, and partial occlusions. To alleviate the insufficiency of one-shot training samples, we augment the training samples by artificially synthesizing versions at various temporal scales, which helps in coping with gestures performed at varying speeds. We evaluate the proposed method on the ChaLearn gesture dataset (CGD). The results show that our approach outperforms all currently published approaches on the challenging data of CGD, such as the translated, scaled, and occluded subsets. When applied to RGB-D datasets that are not one-shot (e.g., the Cornell Activity Dataset-60 and the MSR Daily Activity 3D dataset), the proposed feature also produces very promising results under leave-one-out cross-validation or one-shot learning.
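Illustration: The temporal-scale augmentation mentioned in the summary can be sketched as below. This record does not say how the authors synthesize the extra samples, so the nearest-frame resampling used here is an assumption, and the function names `resample_temporal` and `augment_one_shot`, as well as the chosen scale values, are hypothetical.

```python
import numpy as np


def resample_temporal(frames, scale):
    """Return `frames` replayed at `scale` times the original speed.

    frames: array of shape (T, ...) holding one gesture clip (RGB or depth).
    scale > 1 yields a shorter (faster) clip, scale < 1 a longer (slower) one.
    Nearest-frame sampling is an assumption made for this sketch.
    """
    frames = np.asarray(frames)
    n_in = frames.shape[0]
    n_out = max(1, int(round(n_in / scale)))
    # Evenly spaced positions over the clip, snapped to the nearest frame.
    idx = np.round(np.linspace(0, n_in - 1, n_out)).astype(int)
    return frames[idx]


def augment_one_shot(frames, scales=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Build one synthetic clip per temporal scale from a single sample."""
    return {s: resample_temporal(frames, s) for s in scales}
```

For a 20-frame clip, `augment_one_shot(clip)[2.0]` holds 10 frames and `augment_one_shot(clip)[0.5]` holds 40, so a classifier trained on the set sees the same gesture at several speeds.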
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2015.2513479