Investigation of Transfer Learning for End-to-End Russian Speech Recognition

Bibliographic Details
Published in: Speech and Computer, Vol. 13721, pp. 349-357
Main Author: Kipyatkova, Irina
Format: Book Chapter
Language: English
Published: Switzerland, Springer International Publishing AG, 2022
Springer International Publishing
Series: Lecture Notes in Computer Science
ISBN: 3031209796
9783031209796
ISSN: 0302-9743
1611-3349
DOI: 10.1007/978-3-031-20980-2_30

Summary: End-to-end speech recognition systems reduce speech decoding time and memory requirements compared with standard systems. However, they need much more training data, which complicates building such systems for low-resource languages. One way to improve the performance of an end-to-end low-resource speech recognition system is to pre-train the model by transfer learning, that is, to train the model on non-target data and then transfer the trained parameters to the target model. The aim of the current research was to investigate the application of transfer learning to the training of an end-to-end Russian speech recognition system in low-resource conditions. We used several speech corpora in different languages for pre-training. The end-to-end model was then fine-tuned on a small Russian speech corpus of 60 h. We conducted experiments on applying transfer learning to different parts of the model (feature extraction block, encoder, and attention mechanism) as well as on freezing the lower layers. We achieved a 24.53% relative word error rate reduction compared with the baseline system trained without transfer learning.
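The two ideas named in the summary, transferring the parameters of selected model parts with optional freezing, and relative word error rate reduction, can be illustrated with a minimal sketch. It is not the chapter's implementation: the parameter names, the dictionary representation of a model, the helper functions, and all numbers below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical stand-in for a model: parameter name -> list of weights.
Params = Dict[str, List[float]]


def transfer_parameters(
    pretrained: Params,
    target: Params,
    blocks: Iterable[str],
    frozen: Iterable[str] = (),
) -> Tuple[Params, Set[str]]:
    """Copy pre-trained weights into the target model for selected blocks.

    `blocks` are name prefixes of the parts to transfer; `frozen` are name
    prefixes of the parts that should not be updated during fine-tuning.
    """
    block_prefixes = tuple(blocks)
    frozen_prefixes = tuple(frozen)
    merged: Params = {}
    frozen_names: Set[str] = set()
    for name, value in target.items():
        if name.startswith(block_prefixes) and name in pretrained:
            merged[name] = list(pretrained[name])  # transferred weights
        else:
            merged[name] = list(value)  # keep the target initialization
        if name.startswith(frozen_prefixes):
            frozen_names.add(name)  # excluded from fine-tuning updates
    return merged, frozen_names


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Hypothetical parameters, not taken from the chapter.
pretrained = {"features.conv1": [0.1, 0.2], "encoder.layer1": [0.3], "attention.w": [0.4]}
target = {"features.conv1": [0.0, 0.0], "encoder.layer1": [0.0],
          "attention.w": [0.0], "decoder.out": [0.9]}

# Transfer the feature extractor and encoder; freeze only the feature extractor.
merged, frozen_names = transfer_parameters(
    pretrained, target, blocks=["features.", "encoder."], frozen=["features."]
)

# Made-up WER values (in percent), used only to show the arithmetic.
reduction = relative_wer_reduction(baseline_wer=40.0, new_wer=30.0)
```

In this sketch the attention and decoder weights keep their target initialization, and a fine-tuning loop would skip updates for every name in `frozen_names`.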