Semi-Supervised Learning with Limited Data for Automatic Speech Recognition

Bibliographic Details
Published in: 2022 IEEE 7th Forum on Research and Technologies for Society and Industry Innovation (RTSI), pp. 136-141
Main Authors: Pudo, Mikolaj; Szczepanek, Natalia; Lukasiak, Bozena; Janicki, Artur
Format: Conference Proceeding
Language: English
Published: IEEE, 24.08.2022
Summary: In this paper, we analyze the performance of semi-supervised learning (SSL) methods for the automatic speech recognition (ASR) task. We focus on the case of model adaptation using small unlabeled datasets. The basic SSL method that we apply uses pseudo-labels generated by the adapted model itself; however, we also propose and analyze a number of improvements to SSL. Furthermore, we investigate the possibility of using these methods on datasets whose token distributions differ significantly from that of the training data. We show that in certain conditions, even very small amounts of data can improve the ASR model performance. Using the proposed SSL variant, we were able to reduce WER by 12-22%, depending on the dataset.
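
The summary describes the basic SSL variant as self-training: the adapted model transcribes the small unlabeled dataset and is then fine-tuned on its own pseudo-labels. The Python sketch below illustrates that loop only in outline; the model interface (transcribe, fine_tune), the confidence filtering, and all parameter values are assumptions made for illustration, not the paper's actual implementation or any specific toolkit's API.

    # Minimal sketch of pseudo-label self-training for ASR adaptation.
    # "model.transcribe" and "model.fine_tune" are hypothetical placeholders.

    def self_training_adaptation(model, unlabeled_audio,
                                 confidence_threshold=0.9, rounds=3):
        """Adapt an ASR model on a small unlabeled dataset via its own pseudo-labels."""
        for _ in range(rounds):
            pseudo_labeled = []
            for utterance in unlabeled_audio:
                # The current model transcribes the unlabeled audio ...
                hypothesis, confidence = model.transcribe(utterance)  # hypothetical API
                # ... and only confident hypotheses are kept as pseudo-labels
                # (confidence filtering is one possible refinement, assumed here).
                if confidence >= confidence_threshold:
                    pseudo_labeled.append((utterance, hypothesis))
            # Fine-tune the model on its own pseudo-labels before the next round.
            model.fine_tune(pseudo_labeled)  # hypothetical API
        return model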
ISSN: 2687-6817
DOI: 10.1109/RTSI55261.2022.9905112