Semi-Supervised Learning with Limited Data for Automatic Speech Recognition

Bibliographic Details
Published in: 2022 IEEE 7th Forum on Research and Technologies for Society and Industry Innovation (RTSI), pp. 136-141
Main Authors: Pudo, Mikolaj; Szczepanek, Natalia; Lukasiak, Bozena; Janicki, Artur
Format: Conference Proceeding
Language: English
Published: IEEE, 24.08.2022
Summary: In this paper, we analyze the performance of semi-supervised learning (SSL) methods for the automatic speech recognition (ASR) task. We focus on the case of model adaptation using small unlabeled datasets. The basic SSL method that we apply uses pseudo-labels generated by the adapted model itself; however, we also propose and analyze a number of improvements to SSL. Furthermore, we investigate the possibility of using these methods on datasets whose token distributions differ significantly from that of the training data. We show that in certain conditions, even very small amounts of data can improve the ASR model performance. Using the proposed SSL variant, we were able to reduce WER by 12-22%, depending on the dataset.
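
The summary describes the basic SSL variant as self-training: the adapted model transcribes the small unlabeled dataset and is then fine-tuned on its own pseudo-labels. The Python sketch below illustrates that loop only in outline; the model interface (transcribe, fine_tune), the confidence filtering, and all parameter values are assumptions made for illustration, not the paper's actual implementation or any specific toolkit's API.

    # Minimal sketch of pseudo-label self-training for ASR adaptation.
    # "model.transcribe" and "model.fine_tune" are hypothetical placeholders.

    def self_training_adaptation(model, unlabeled_audio,
                                 confidence_threshold=0.9, rounds=3):
        """Adapt an ASR model on a small unlabeled dataset via its own pseudo-labels."""
        for _ in range(rounds):
            pseudo_labeled = []
            for utterance in unlabeled_audio:
                # The current model transcribes the unlabeled audio ...
                hypothesis, confidence = model.transcribe(utterance)  # hypothetical API
                # ... and only confident hypotheses are kept as pseudo-labels
                # (confidence filtering is one possible refinement, assumed here).
                if confidence >= confidence_threshold:
                    pseudo_labeled.append((utterance, hypothesis))
            # Fine-tune the model on its own pseudo-labels before the next round.
            model.fine_tune(pseudo_labeled)  # hypothetical API
        return model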
ISSN: 2687-6817
DOI: 10.1109/RTSI55261.2022.9905112