EMG-based speech recognition using dimensionality reduction methods
Published in: Journal of Ambient Intelligence and Humanized Computing, Vol. 14, No. 1, pp. 597–607
Main Authors: , , , ,
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.01.2023 (Springer Nature B.V.)
Summary: Automatic speech recognition is the main form of man–machine communication. Recently, several studies have shown the ability to recognize speech automatically from electromyography (EMG) signals of the facial muscles using machine learning methods. The objective of this study was to utilize machine learning methods for automatic identification of speech based on EMG signals. EMG signals from three facial muscles were recorded from four healthy female subjects while they pronounced seven different words 50 times each. Short-time Fourier transform (STFT) features were extracted from the EMG data. Principal component analysis (PCA) and locally linear embedding (LLE) were applied and compared for reducing the dimensionality of the EMG data. A k-nearest-neighbors classifier was used to examine the ability to identify different word sets of a subject based on her own dataset, and to identify the words of one subject based on another subject's dataset, using an affine transformation to align the reduced feature spaces of the two subjects. PCA and LLE achieved an average recognition rate of 81% for five-word sets in the single-subject approach. The best average recognition rates for three- and five-word sets were 88.8% and 74.6%, respectively, in the multi-subject classification approach. Both PCA and LLE achieved satisfactory classification rates for the single-subject and multi-subject approaches. The multi-subject classification approach enables robust classification of words recorded from a new subject based on another subject's dataset, and can therefore be applicable for people who have lost the ability to speak.
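The pipeline described in the summary (STFT features, PCA dimensionality reduction, k-NN classification, and least-squares affine alignment between two subjects' reduced feature spaces) can be sketched roughly as follows. This is a minimal illustration on synthetic signals, not the authors' implementation: the sampling rate, STFT window, number of components, and the two-frequency stand-in "words" are all assumptions made for the example.

```python
# Hypothetical sketch: STFT features from EMG-like traces -> PCA -> k-NN,
# plus a least-squares affine alignment between two subjects' reduced spaces.
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def stft_features(emg, fs=1000):
    """Flatten the STFT magnitude of a 1-D trace into a feature vector."""
    _, _, Z = stft(emg, fs=fs, nperseg=64)
    return np.abs(Z).ravel()

def fake_dataset(shift=0.0):
    """Synthetic stand-in for real recordings: two 'words', 20 repetitions
    each, each a 500-sample trace with a word-dependent frequency."""
    X, y = [], []
    t = np.arange(500) / 1000.0
    for label, f in enumerate([40.0, 90.0]):
        for _ in range(20):
            emg = np.sin(2 * np.pi * (f + shift) * t) + 0.1 * rng.standard_normal(500)
            X.append(stft_features(emg))
            y.append(label)
    return np.array(X), np.array(y)

# Single-subject approach: reduce with PCA, classify with k-NN.
X, y = fake_dataset()
pca = PCA(n_components=5).fit(X)
Z = pca.transform(X)
knn = KNeighborsClassifier(n_neighbors=3).fit(Z, y)
acc = knn.score(Z, y)

# Multi-subject approach: align a "second subject's" reduced space to the
# first via an affine map (A, b) fitted by least squares on paired samples.
X2, y2 = fake_dataset(shift=2.0)
Z2 = PCA(n_components=5).fit_transform(X2)
Z2h = np.hstack([Z2, np.ones((len(Z2), 1))])  # homogeneous coordinates
M, *_ = np.linalg.lstsq(Z2h, Z, rcond=None)   # solves Z2h @ M ~= Z
Z2_aligned = Z2h @ M
acc_cross = knn.score(Z2_aligned, y2)          # classify subject 2 with subject 1's model
```

The affine step matters because PCA (and LLE) embeddings of two subjects are only defined up to rotation, scaling, and offset; fitting the map on paired repetitions brings the second subject's clusters into the first subject's coordinate frame before classification.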
ISSN: 1868-5137, 1868-5145
DOI: 10.1007/s12652-021-03315-5