A new weighted multi-scale descriptor for hand gesture recognition
Published in | Multimedia tools and applications Vol. 83; no. 14; pp. 43325-43347 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.04.2024; Springer Nature B.V |
Summary: | Image-based hand gesture recognition is challenging because the hand is a much smaller object than the whole human body and has complex articulations. It occupies only a small portion of the image and is more easily affected by segmentation errors, so it needs a fine-grained description. This paper proposes a new weighted multi-scale feature descriptor (WMD), computed along the contour of the hand, for robust hand gesture recognition from depth images. First, a weight factor is estimated for each contour point using a 2D Gaussian smoothing function and the Prewitt operator, which relates the point to its neighbors and highlights its importance. The WMD descriptor is then constructed by 1D left-side and right-side Gaussian smoothing, on the grounds that contour points are more sensitive than interior points of the hand and depend on each other when used to recognize gestures. The granularity of the descriptor is characterized by multiple scales, each corresponding to a different standard deviation of the Gaussian function. Its invariance to translation, rotation, and scaling is proved theoretically and validated experimentally. Finally, extensive experiments on the authors' own ten-gesture dataset and two public datasets compare the proposed algorithm with three distance-based and two CNN-based hand gesture recognition methods. The results show that the proposed method outperforms the others and achieves a good combination of accuracy (more than 95%) and computational efficiency (0.054 s per frame on average). |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-023-17319-0 |
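
The summary mentions one-sided (left and right) 1D Gaussian smoothing along the hand contour at several standard deviations. The sketch below only illustrates that general idea and is not the paper's WMD formulation, which this record does not give. Everything in it is an assumption made for illustration: the function names, the use of a normalized centroid-distance signal as the per-point value, the 3-sigma kernel cutoff, and the fact that the per-point weights are taken as input rather than estimated from a depth image with 2D Gaussian smoothing and the Prewitt operator.

```python
import math

def centroid_distance_signature(contour):
    """Hypothetical per-point signal: distance to the centroid divided by
    the mean distance, so it ignores translation and uniform scaling."""
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dist = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean = sum(dist) / n
    return [d / mean for d in dist]

def one_sided_gaussian(values, sigma, side):
    """Smooth a closed (cyclic) sequence using neighbours on one side only.
    side=-1 looks at preceding points (left), side=+1 at following (right)."""
    n = len(values)
    # Assumed cutoff: 3 sigma, capped so the window never wraps onto itself.
    radius = min(n - 1, max(1, math.ceil(3 * sigma)))
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(radius + 1)]
    total = sum(kernel)
    return [
        sum(w * values[(i + side * k) % n] for k, w in enumerate(kernel)) / total
        for i in range(n)
    ]

def multiscale_descriptor(values, weights, sigmas):
    """One row per scale; each row holds a (left, right) pair per contour point.
    `weights` stands in for the per-point weight factors (assumed given)."""
    weighted = [v * w for v, w in zip(values, weights)]
    rows = []
    for sigma in sigmas:
        left = one_sided_gaussian(weighted, sigma, -1)
        right = one_sided_gaussian(weighted, sigma, +1)
        rows.append(list(zip(left, right)))
    return rows

# Usage with a toy closed contour and uniform weights.
contour = [(0, 0), (4, 0), (4, 1), (0, 3)]
signal = centroid_distance_signature(contour)
descriptor = multiscale_descriptor(signal, [1.0] * len(contour), [0.5, 1.0, 2.0])
```

A small sigma keeps the descriptor close to the local contour signal, while a larger sigma averages over more neighbouring points, which is one plausible reading of the "granularity" the summary refers to.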