A Study on Speech Recognition by a Neural Network Based on English Speech Feature Parameters

Bibliographic Details
Published in	Journal of advanced computational intelligence and intelligent informatics Vol. 28; no. 3; pp. 679 - 684
Main Authors	Mao, Congmin, Liu, Sujing
Format	Journal Article
Language	English
Published	Tokyo: Fuji Technology Press Co. Ltd, 01.05.2024
Subjects	Algorithms; Artificial neural networks; Back propagation networks; Filter banks; Neural networks; Parameters; Recurrent neural networks; Speech; Speech recognition; Voice recognition
ISSN	1343-0130 1883-8014
DOI	10.20965/jaciii.2024.p0679

Summary:	This study examines English speech recognition from the perspective of speech feature parameters. Two feature parameters, the mel-frequency cepstral coefficient (MFCC) and the filter bank (Fbank), were selected for recognizing English speech. Three recognition algorithms were used: the classical back-propagation neural network (BPNN), the recurrent neural network (RNN), and the long short-term memory (LSTM) network, which is an improved form of the RNN. The experiments compared the three algorithms with one another and compared the effect of the two feature parameters on their performance. The LSTM model had the best recognition performance of the three networks across the different experimental environments, and models using MFCC features outperformed those using Fbank features. The LSTM model had the highest accuracy and the highest speed, the RNN model ranked second, and the BPNN model ranked last. The results confirm that combining the LSTM model with MFCC feature extraction achieves higher English speech recognition accuracy than the other neural networks tested.
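
The summary contrasts Fbank and MFCC features without showing how the two relate. As a rough illustration only, the sketch below computes log mel filter bank (Fbank) energies from a waveform and then derives MFCCs from them with a discrete cosine transform, which is the standard relationship between the two feature types. Every function name and parameter value here (16 kHz sampling, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, a random test signal) is an assumption chosen for illustration; the record does not state the settings used in the article, and pre-emphasis and liftering are omitted for brevity.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fb[m - 1, k] = (right - k) / (right - center)
    return fb

def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def fbank_features(signal, sample_rate=16000, frame_len=400, hop=160,
                   n_fft=512, n_filters=26):
    """Log mel filter bank energies, shape (n_frames, n_filters)."""
    frames = frame_signal(signal, frame_len, hop)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)        # small floor avoids log(0)

def mfcc_features(fbank, n_ceps=13):
    """MFCCs as the first n_ceps unnormalized DCT-II coefficients of the Fbank features."""
    n_filters = fbank.shape[1]
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return fbank @ basis.T

# Illustrative input: one second of random noise standing in for speech.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
fbank = fbank_features(signal)   # one 26-dimensional vector per frame
mfcc = mfcc_features(fbank)      # one 13-dimensional vector per frame
```

Under these assumptions, either `fbank` or `mfcc` could serve as the per-frame input sequence to a BPNN, RNN, or LSTM classifier. The DCT step decorrelates and compresses the filter bank energies, which is the main practical difference between the two feature types.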