Hand gesture classification using time–frequency images and transfer learning based on CNN

Bibliographic Details
Published in	Biomedical signal processing and control Vol. 77; p. 103787
Main Authors	Ozdemir, Mehmet Akif, Kisa, Deniz Hande, Guren, Onan, Akan, Aydin
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.08.2022
Subjects	Convolutional Neural Networks (CNN); CWT; Electromyogram (EMG); Hand Gesture Classification; Hilbert-Huang Transform (HHT); STFT

More Information
Summary:	Hand gesture-based systems are among the most effective technological advances and continue to develop alongside improvements in human–computer interaction. Surface electromyography (sEMG) is a popular input for gesture classification, offering high accuracy and advanced control capability. This paper presents a comparative hand gesture classification approach using time–frequency (TF) images of spontaneous sEMG signals and transfer learning. Four-channel sEMG signals are collected from 30 subjects performing 7 specific hand gestures. After the required pre-processing, segmentation, and windowing steps, three TF analysis methods, namely Short-Time Fourier Transform (STFT), Continuous Wavelet Transform (CWT), and Hilbert-Huang Transform (HHT), are applied to the sEMG signals to obtain TF images. Spectrograms from STFT, scalograms from CWT, and Hilbert-Huang spectra (HHS) from HHT, each obtained from the multi-channel sEMG data, are fused separately. Seven state-of-the-art pre-trained Convolutional Neural Network (CNN) architectures then extract features from the TF images and classify the seven hand gestures. Two cross-validation strategies are used to evaluate the proposed method: stratified k-fold cross-validation (SKCV) and leave-one-subject-out cross-validation (LOOCV). The effect of window size and of the combination of Intrinsic Mode Functions (IMFs) on classification performance is also investigated. The results demonstrate that the HHT, using IMFs obtained by Empirical Mode Decomposition (EMD), provides improved TF resolution and better results than STFT and CWT in the classification of sEMG signals. The best average accuracies, 93.75% for SKCV and 94.41% for LOOCV, are obtained by the HHT method with the pre-trained ResNet-50 model.
ISSN:	1746-8094 1746-8108
DOI:	10.1016/j.bspc.2022.103787
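
The two evaluation strategies named in the summary differ in how samples are held out, and a small sketch may make the contrast concrete. The code below is an illustration only, not the authors' implementation. The subject and gesture counts (30 and 7) come from the summary; the number of segments per gesture, the fold count k=5, the random seed, and all function names are assumptions chosen for the example.

```python
import random
from collections import defaultdict


def leave_one_subject_out(subjects):
    """Yield (train_idx, test_idx), holding out every sample of one subject per fold."""
    for held_out in sorted(set(subjects)):
        test = [i for i, s in enumerate(subjects) if s == held_out]
        train = [i for i, s in enumerate(subjects) if s != held_out]
        yield train, test


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) with each class spread evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Deal the shuffled samples of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy index: 30 subjects x 7 gestures (from the summary) x 2 segments (assumed).
N_SUBJECTS, N_GESTURES, SEGMENTS_PER_GESTURE = 30, 7, 2
subjects, labels = [], []
for s in range(N_SUBJECTS):
    for g in range(N_GESTURES):
        for _ in range(SEGMENTS_PER_GESTURE):
            subjects.append(s)
            labels.append(g)

loso_folds = list(leave_one_subject_out(subjects))
skcv_folds = list(stratified_k_fold(labels, k=5))  # k=5 is an assumed value
```

In the subject-wise split, no sample from the held-out subject appears in training, so each fold measures generalization to an unseen person. In the stratified split, every fold preserves the gesture proportions, but samples from the same subject can fall on both sides of the split.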