A New Speech Recognition Model in a Human-Robot Interaction Scenario Using NAO Robot: Proposal and Preliminary Model

Bibliographic Details
Published in	2021 International Conference on Communication & Information Technology (ICICT), pp. 215-220
Main Authors	Younis, Hussain A., Mohamed, A.S.A., Ab Wahab, M. N., Jamaludin, R., Salisu, Sani
Format	Conference Proceeding
Language	English
Published	IEEE 05.06.2021
Subjects	Feature extraction; Hidden Markov models; Linguistics; NAO-robot; Natural Language Processing (NLP); Neural networks; Speech recognition; Text recognition; Tokenization

Summary:	Speech recognition goes by several names: automatic speech recognition (ASR), speech-to-text, and computer speech recognition. For a single user's voice, it is necessary to distinguish between speech recognition and voice recognition. The first translates speech into text, that is, the audible voice and its concept (human speech); the second only identifies a sound, such as an animal sound or a car. No algorithm is designed specifically for this field; instead, techniques such as N-grams and neural networks are used to explain and treat this type of problem. Other techniques include Natural Language Processing (NLP), the Hidden Markov Model (HMM), and Speaker Diarization (SD); the last of these is addressed in this work. Natural language processing is a computational technique that can be applied at various levels of linguistic analysis (dare, deep analysis) to represent natural language in a useful representation. Current recognition and identification systems can still be improved to achieve greater accuracy. A new approach is proposed that processes speech in four stages: speech recognition, tokenization, extraction of speech features from text, and part of speech: the three patterns of Named Entity Recognition (NER), followed by the possibility of implementing the proposed model. It achieved more accurate and applicable results in an educational environment using a NAO robot.
DOI:	10.1109/ICICT52195.2021.9568457
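
The summary names tokenization and N-grams among the techniques involved. A minimal sketch of those two steps follows, using only the Python standard library. It is not the authors' implementation: the function names, the regular expression, and the sample transcript are all assumptions made for illustration, and the transcript is a stand-in for whatever text the speech recognition stage would produce.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative rule only)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical transcript, standing in for the output of a speech recognizer.
transcript = "the robot answers the student and the robot repeats the question"

tokens = tokenize(transcript)
bigram_counts = Counter(ngrams(tokens, 2))

# Show the most frequent bigrams in the transcript.
for bigram, count in bigram_counts.most_common(3):
    print(" ".join(bigram), count)
```

Counting bigrams this way is only the simplest form of an N-gram model; the later stages listed in the summary (feature extraction, part of speech, and NER) would operate on the same token list.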