CNN based feature extraction and classification for sign language

Bibliographic Details
Published in: Multimedia Tools and Applications, Vol. 80, No. 2, pp. 3051–3069
Main Authors: Barbhuiya, Abul Abbas; Karsh, Ram Kumar; Jain, Rahul
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.01.2021

Summary: Hand gestures have been one of the most prominent means of communication since the beginning of the human era. Hand gesture recognition makes human-computer interaction (HCI) more convenient and flexible; it is therefore important to identify each character correctly for smooth and error-free HCI. A literature survey reveals that most existing hand gesture recognition (HGR) systems have considered only a few simple discriminating gestures when reporting recognition performance. This paper applies deep learning-based convolutional neural networks (CNNs) for robust modeling of static signs in the context of sign language recognition. In this work, CNNs are employed for HGR where both alphabets and numerals of ASL are considered simultaneously, and the pros and cons of CNNs used for HGR are highlighted. The CNN architecture is based on modified AlexNet and modified VGG16 models for classification. Modified pre-trained AlexNet-based and modified pre-trained VGG16-based architectures are used for feature extraction, followed by a multiclass support vector machine (SVM) classifier. The results are evaluated over features from different layers to find the best recognition performance. To examine the accuracy of the HGR schemes, both leave-one-subject-out and random 70–30 split cross-validation approaches were adopted. This work also reports the recognition accuracy of each character and the confusion among characters with near-identical gestures. The experiments are performed on a simple CPU system rather than high-end GPU systems to demonstrate the cost-effectiveness of this work. The proposed system achieves a recognition accuracy of 99.82%, which is better than some state-of-the-art methods.
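The feature-extraction-plus-SVM pipeline the abstract describes can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' exact modified architecture: it pulls 4096-dimensional features from the first fully connected layer of an ImageNet pre-trained VGG16 (the paper also evaluates other layers and a modified AlexNet) and feeds them to a linear multiclass SVM. The dataset paths, image size, batch size, choice of layer, and SVM kernel are all illustrative assumptions.

```python
# A minimal sketch of a pre-trained-CNN-features + multiclass-SVM classifier,
# in the spirit of the pipeline described in the abstract. Paths, layer choice,
# and hyperparameters are assumptions, not the authors' exact configuration.
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Standard ImageNet preprocessing expected by VGG16.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical directory layout: one sub-folder per ASL sign.
train_set = ImageFolder("asl/train", transform=preprocess)
test_set = ImageFolder("asl/test", transform=preprocess)

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.eval()

# Take the output of the first fully connected layer (fc6, 4096-D) as the
# feature vector; the paper compares features from several layers.
feature_extractor = torch.nn.Sequential(
    vgg.features,
    vgg.avgpool,
    torch.nn.Flatten(),
    *list(vgg.classifier.children())[:2],  # Linear(25088, 4096) + ReLU
)

@torch.no_grad()
def extract(dataset):
    """Run all images through the frozen CNN and collect feature vectors."""
    feats, labels = [], []
    for x, y in DataLoader(dataset, batch_size=32):
        feats.append(feature_extractor(x))
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

X_train, y_train = extract(train_set)
X_test, y_test = extract(test_set)

# Linear multiclass SVM (scikit-learn handles the one-vs-one decomposition).
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

Because the CNN is frozen and only the SVM is trained, this kind of pipeline runs comfortably on a CPU, which is consistent with the abstract's emphasis on avoiding high-end GPU systems.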
ISSN: 1380-7501 (print); 1573-7721 (electronic)
DOI: 10.1007/s11042-020-09829-y