CNN based feature extraction and classification for sign language

Bibliographic Details
Published in: Multimedia Tools and Applications, Vol. 80, No. 2, pp. 3051–3069
Main Authors: Barbhuiya, Abul Abbas; Karsh, Ram Kumar; Jain, Rahul
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.01.2021

Summary: Hand gestures have been one of the most prominent means of communication since the beginning of the human era. Hand gesture recognition makes human-computer interaction (HCI) more convenient and flexible; it is therefore important to identify each character correctly for smooth and error-free HCI. A literature survey reveals that most existing hand gesture recognition (HGR) systems have considered only a few simple discriminating gestures when reporting recognition performance. This paper applies deep learning-based convolutional neural networks (CNNs) for robust modeling of static signs in the context of sign language recognition. In this work, CNNs are employed for HGR where both alphabets and numerals of ASL are considered simultaneously, and the pros and cons of CNNs used for HGR are highlighted. The CNN architecture is based on modified AlexNet and modified VGG16 models for classification. Modified pre-trained AlexNet-based and modified pre-trained VGG16-based architectures are used for feature extraction, followed by a multiclass support vector machine (SVM) classifier. The results are evaluated over features from different layers to find the best recognition performance. To examine the accuracy of the HGR schemes, both leave-one-subject-out and random 70–30 split cross-validation approaches were adopted. This work also reports the recognition accuracy of each character and the confusion among characters with near-identical gestures. The experiments are performed on a simple CPU system rather than high-end GPU systems to demonstrate the cost-effectiveness of this work. The proposed system achieves a recognition accuracy of 99.82%, which is better than some state-of-the-art methods.
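The feature-extraction-plus-SVM pipeline the abstract describes can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' exact modified architecture: it pulls 4096-dimensional features from the first fully connected layer of an ImageNet pre-trained VGG16 (the paper also evaluates other layers and a modified AlexNet) and feeds them to a linear multiclass SVM. The dataset paths, image size, batch size, choice of layer, and SVM kernel are all illustrative assumptions.

```python
# A minimal sketch of a pre-trained-CNN-features + multiclass-SVM classifier,
# in the spirit of the pipeline described in the abstract. Paths, layer choice,
# and hyperparameters are assumptions, not the authors' exact configuration.
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Standard ImageNet preprocessing expected by VGG16.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical directory layout: one sub-folder per ASL sign.
train_set = ImageFolder("asl/train", transform=preprocess)
test_set = ImageFolder("asl/test", transform=preprocess)

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.eval()

# Take the output of the first fully connected layer (fc6, 4096-D) as the
# feature vector; the paper compares features from several layers.
feature_extractor = torch.nn.Sequential(
    vgg.features,
    vgg.avgpool,
    torch.nn.Flatten(),
    *list(vgg.classifier.children())[:2],  # Linear(25088, 4096) + ReLU
)

@torch.no_grad()
def extract(dataset):
    """Run all images through the frozen CNN and collect feature vectors."""
    feats, labels = [], []
    for x, y in DataLoader(dataset, batch_size=32):
        feats.append(feature_extractor(x))
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

X_train, y_train = extract(train_set)
X_test, y_test = extract(test_set)

# Linear multiclass SVM (scikit-learn handles the one-vs-one decomposition).
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

Because the CNN is frozen and only the SVM is trained, this kind of pipeline runs comfortably on a CPU, which is consistent with the abstract's emphasis on avoiding high-end GPU systems.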
ISSN: 1380-7501 (print); 1573-7721 (electronic)
DOI: 10.1007/s11042-020-09829-y