A new weighted multi-scale descriptor for hand gesture recognition

Bibliographic Details
Published in	Multimedia tools and applications Vol. 83; no. 14; pp. 43325 - 43347
Main Authors	Zhang, Beiwei; Ding, Wen; Ye, JiaSheng
Format	Journal Article
Language	English
Published	New York: Springer US, 01.04.2024; Springer Nature B.V.
Subjects	Accuracy; Algorithms; Cameras; Computer Communication Networks; Computer Science; Contours; Data Structures and Information Theory; Datasets; Gesture recognition; Image processing; Machine learning; Multimedia; Multimedia Information Systems; Neural networks; Sensors; Skin; Smoothing; Special Purpose and Application-Based Systems; Track 3: Biometrics and HCI; Gaussian smoothing; WMD descriptor; Prewitt operator; Hand gesture recognition

Summary:	Image-based hand gesture recognition is challenging because the hand is a small object with complex articulations compared with the entire human body. It occupies only a small portion of the image and is more easily affected by segmentation errors, so it needs a fine-grained description. This paper proposes a new weighted multi-scale feature descriptor (WMD), computed along the contour of the hand, for robust hand gesture recognition from depth images. First, a weight factor is estimated for each contour point using a 2D Gaussian smoothing function and the Prewitt operator, to relate the point to its neighbors and highlight its importance. Then the WMD descriptor is constructed via 1D left-side and right-side Gaussian smoothing, on the grounds that contour points are more sensitive than the inner points of the hand and depend on each other when used to recognize gestures. The granularity of the descriptor is characterized by multiple scales, each with a different standard deviation of the Gaussian function. Its invariance to translation, rotation, and scaling is proved theoretically and validated experimentally. Finally, extensive experiments on our self-established ten-gesture dataset and two public datasets compare the proposed algorithm with three distance-based and two CNN-based hand gesture recognition methods. The results show that the method outperforms the others and achieves a good combination of accuracy (more than 95%) and computational efficiency (0.054 s per frame on average).
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-023-17319-0
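
The Summary describes left-side and right-side 1D Gaussian smoothing along the hand contour at several scales. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. The record does not say which per-point quantity is smoothed, so the normalized centroid distance used here is an assumption, as are the function names, the default `sigmas`, the 3-sigma kernel radius, and the uniform default `weights` (the paper derives its weights from 2D Gaussian smoothing and the Prewitt operator, which this sketch does not reproduce).

```python
import math


def half_gaussian(sigma, radius):
    """Normalized one-sided Gaussian weights for offsets 1..radius."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(1, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


def centroid_distance_signal(contour):
    """Distance of each point to the centroid, divided by the largest distance.

    Assumption: this normalized distance stands in for the per-point feature.
    It is unchanged by translating, rotating, or uniformly scaling the contour.
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    d = [math.hypot(x - cx, y - cy) for x, y in contour]
    d_max = max(d)
    if d_max == 0:
        raise ValueError("degenerate contour: all points coincide")
    return [v / d_max for v in d]


def multiscale_descriptor(contour, sigmas=(1.0, 2.0, 4.0), weights=None):
    """Return one (left, right) pair of smoothed signals per scale.

    contour: ordered (x, y) points of a closed contour.
    weights: hypothetical per-point weight factors; uniform if omitted.
    """
    n = len(contour)
    signal = centroid_distance_signal(contour)
    if weights is None:
        weights = [1.0] * n
    weighted = [w * s for w, s in zip(weights, signal)]

    descriptor = []
    for sigma in sigmas:
        # Truncate the kernel at about 3 sigma, but never wrap past the contour.
        radius = max(1, min(n - 1, int(math.ceil(3 * sigma))))
        kernel = half_gaussian(sigma, radius)
        left, right = [], []
        for i in range(n):
            # Left side uses preceding points, right side uses following points;
            # indices wrap around because the contour is closed.
            left.append(sum(kernel[k - 1] * weighted[(i - k) % n]
                            for k in range(1, radius + 1)))
            right.append(sum(kernel[k - 1] * weighted[(i + k) % n]
                             for k in range(1, radius + 1)))
        descriptor.append((left, right))
    return descriptor
```

Calling `multiscale_descriptor(points)` on an ordered list of contour points returns, for each standard deviation, two lists with one value per contour point. Larger standard deviations average over more neighbors and so give a coarser description, which is the sense in which multiple scales control granularity. Under the assumptions above, the output does not change when the contour is translated, rotated, or uniformly scaled, provided the points keep the same order and starting point.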