Deep Learning-Based Hand Gesture Recognition System and Design of a Human–Machine Interface

Bibliographic Details
Published in: Neural Processing Letters, Vol. 55, No. 9, pp. 12569–12596
Main Authors: Sen, Abir; Mishra, Tapas Kumar; Dash, Ratnakar
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.12.2023
ISSN: 1370-4621, 1573-773X
DOI: 10.1007/s11063-023-11433-8

More Information
Summary: Hand gesture recognition plays an important role in developing effective human–machine interfaces (HMIs) that enable direct communication between humans and machines. However, in real-time scenarios it is difficult to identify the correct hand gesture for controlling an application while the hands are moving. To address this issue, this work presents a low-cost, hand-gesture-recognition-based human–computer interface (HCI) for real-time scenarios. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) feature extraction and gesture classification using five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) building an interactive human–machine interface (HMI), (5) development of a gesture-controlled virtual mouse, and (6) smoothing of the virtual mouse pointer using a Kalman filter. Five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-V1) and a ViT have been employed to classify hand gesture images, and two multi-class datasets (one public and one custom) have been used to validate the models. Comparing the models' performances, Inception-V1 shows significantly better classification performance than the other four CNN models and the ViT in terms of accuracy, precision, recall, and F-score. We have also extended this system to control multimedia applications (such as the VLC media player, an audio player, and the 2D Super-Mario-Bros game) with different customized gesture commands in real-time scenarios. The average speed of the system reaches 25 fps (frames per second), which meets the requirements of real-time use, and the proposed gesture-control system achieves millisecond-scale average response times for each control, making it suitable for real-time operation. This prototype will benefit physically disabled people in interacting with desktop computers.
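
Illustrative note: the article's implementation is not reproduced in this record. As a sketch of stage (6), pointer smoothing, the following Python/NumPy snippet shows how a constant-velocity Kalman filter can stabilize noisy 2D cursor coordinates before they drive a virtual mouse. The class name, the dt value tied to the reported 25 fps, and the noise covariances are illustrative assumptions, not parameters taken from the paper.

import numpy as np

class PointerKalmanFilter:
    """Constant-velocity Kalman filter for smoothing 2D cursor coordinates.

    State x = [px, py, vx, vy]; only the noisy position (px, py) is observed.
    process_noise and measurement_noise are illustrative tuning values, not
    the parameters used in the paper.
    """

    def __init__(self, dt=1/25, process_noise=1e-3, measurement_noise=1e-1):
        # State transition: position advances by velocity * dt each frame.
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        # Measurement matrix: we observe position only.
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = process_noise * np.eye(4)       # process noise covariance
        self.R = measurement_noise * np.eye(2)   # measurement noise covariance
        self.x = np.zeros(4)                     # state estimate
        self.P = np.eye(4)                       # estimate covariance

    def update(self, measured_xy):
        # Predict step: propagate state and covariance one frame forward.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Correct step with the newly detected hand/fingertip position.
        z = np.asarray(measured_xy, dtype=float)
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R          # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                                # smoothed (px, py)

# Usage: feed each raw detection; move the OS cursor to the smoothed output.
kf = PointerKalmanFilter(dt=1/25)  # dt matches the reported ~25 fps
smoothed = kf.update((640.0, 360.0))

A constant-velocity model is a common default for pointer smoothing: it trades a small amount of lag for strong suppression of frame-to-frame detection jitter.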