EMG-based speech recognition using dimensionality reduction methods
Published in: Journal of Ambient Intelligence and Humanized Computing, Vol. 14, No. 1, pp. 597–607
Main Authors: , , , ,
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.01.2023 (Springer Nature B.V.)
Summary: Automatic speech recognition is the main form of man–machine communication. Recently, several studies have shown the ability to recognize speech automatically from electromyography (EMG) signals of the facial muscles using machine learning methods. The objective of this study was to utilize machine learning methods for automatic identification of speech based on EMG signals. EMG signals from three facial muscles were recorded from four healthy female subjects while they pronounced seven different words 50 times each. Short-time Fourier transform (STFT) features were extracted from the EMG data. Principal component analysis (PCA) and locally linear embedding (LLE) were applied and compared for reducing the dimensionality of the EMG data. A k-nearest-neighbors classifier was used to examine the ability to identify different word sets of a subject based on her own dataset, and to identify the words of one subject based on another subject's dataset, using an affine transformation to align the reduced feature spaces of the two subjects. PCA and LLE achieved an average recognition rate of 81% for five-word sets in the single-subject approach. The best average recognition rates for three- and five-word sets were 88.8% and 74.6%, respectively, in the multi-subject classification approach. Both PCA and LLE achieved satisfactory classification rates for the single-subject and multi-subject approaches. The multi-subject classification approach enables robust classification of words recorded from a new subject based on another subject's dataset, and can therefore be applicable for people who have lost the ability to speak.
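The pipeline described in the summary (STFT features, PCA dimensionality reduction, k-NN classification, and least-squares affine alignment between two subjects' reduced feature spaces) can be sketched roughly as follows. This is a minimal illustration on synthetic signals, not the authors' implementation: the sampling rate, STFT window, number of components, and the two-frequency stand-in "words" are all assumptions made for the example.

```python
# Hypothetical sketch: STFT features from EMG-like traces -> PCA -> k-NN,
# plus a least-squares affine alignment between two subjects' reduced spaces.
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def stft_features(emg, fs=1000):
    """Flatten the STFT magnitude of a 1-D trace into a feature vector."""
    _, _, Z = stft(emg, fs=fs, nperseg=64)
    return np.abs(Z).ravel()

def fake_dataset(shift=0.0):
    """Synthetic stand-in for real recordings: two 'words', 20 repetitions
    each, each a 500-sample trace with a word-dependent frequency."""
    X, y = [], []
    t = np.arange(500) / 1000.0
    for label, f in enumerate([40.0, 90.0]):
        for _ in range(20):
            emg = np.sin(2 * np.pi * (f + shift) * t) + 0.1 * rng.standard_normal(500)
            X.append(stft_features(emg))
            y.append(label)
    return np.array(X), np.array(y)

# Single-subject approach: reduce with PCA, classify with k-NN.
X, y = fake_dataset()
pca = PCA(n_components=5).fit(X)
Z = pca.transform(X)
knn = KNeighborsClassifier(n_neighbors=3).fit(Z, y)
acc = knn.score(Z, y)

# Multi-subject approach: align a "second subject's" reduced space to the
# first via an affine map (A, b) fitted by least squares on paired samples.
X2, y2 = fake_dataset(shift=2.0)
Z2 = PCA(n_components=5).fit_transform(X2)
Z2h = np.hstack([Z2, np.ones((len(Z2), 1))])  # homogeneous coordinates
M, *_ = np.linalg.lstsq(Z2h, Z, rcond=None)   # solves Z2h @ M ~= Z
Z2_aligned = Z2h @ M
acc_cross = knn.score(Z2_aligned, y2)          # classify subject 2 with subject 1's model
```

The affine step matters because PCA (and LLE) embeddings of two subjects are only defined up to rotation, scaling, and offset; fitting the map on paired repetitions brings the second subject's clusters into the first subject's coordinate frame before classification.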
ISSN: 1868-5137, 1868-5145
DOI: 10.1007/s12652-021-03315-5