RepackagingAugment: Overcoming Prediction Error Amplification in Weight-Averaged Speech Recognition Models Subject to Self-Training
| Published in | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (2023), pp. 1–5 |
|---|---|
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 04.06.2023 |
| Online Access | Get full text |
| ISSN | 2379-190X |
| DOI | 10.1109/ICASSP49357.2023.10096146 |
Summary: Representation-based speech recognition models have demonstrated state-of-the-art performance on downstream tasks. These models are pre-trained on large-scale unlabeled data, fine-tuned on a small amount of labeled data, and subsequently improved via self-training on pseudo-labels. However, a self-trained representation model produces prediction errors caused by training on the incorrect labels in the pseudo-labeled data. Weight-averaging methods have been employed in a variety of studies to refine the pseudo-labels; however, these methods amplify the prediction errors of the individual self-trained models. To alleviate this problem, we propose RepackagingAugment, a data augmentation method that improves the diversity of models while preventing the same incorrect labels from recurring in every epoch. The augmentation deconstructs the paired speech-text data into word units and repackages them into a randomly determined number of word sequences. This strategy induces the models to produce different prediction errors by mitigating overfitting to incorrect labels. Through experiments on representation models such as wav2vec 2.0 and data2vec, we demonstrate that our approach improves the performance of weight-averaged models.
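The record above carries only the abstract, not code. As a rough illustration of one plausible reading of the repackaging step it describes, the sketch below splits a single aligned utterance into a randomly determined number of shorter paired speech-text segments. It assumes per-word audio segments are already available (e.g., from a forced aligner); the function name `repackage_utterance`, the `max_chunks` parameter, and the chunking scheme are illustrative assumptions, not the authors' implementation.

```python
import random
import numpy as np

def repackage_utterance(word_segments, words, max_chunks=4):
    """Split one aligned utterance into a randomly determined number of
    shorter speech-text pairs (hypothetical sketch of the repackaging idea).

    word_segments: list of 1-D audio arrays, one per word (e.g., cut at
                   word boundaries given by a forced alignment)
    words:         the corresponding transcript words
    """
    assert len(word_segments) == len(words)
    n_words = len(words)
    # Randomly decide how many word sequences to repackage this utterance into.
    n_chunks = random.randint(1, min(max_chunks, n_words))
    # Choose sorted cut points that partition the word sequence into chunks.
    cuts = sorted(random.sample(range(1, n_words), k=n_chunks - 1))
    bounds = [0] + cuts + [n_words]
    pairs = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        audio = np.concatenate(word_segments[lo:hi])  # rejoin the word audio
        text = " ".join(words[lo:hi])                 # rejoin the transcript
        pairs.append((audio, text))
    return pairs
```

Because the cut points are re-drawn each time, an incorrectly pseudo-labeled word no longer recurs in the same sequence every epoch, which is the diversification effect the summary attributes to the method.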
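For context on the weight averaging the summary refers to, averaging self-trained models in parameter space is typically an element-wise mean over their checkpoints. A minimal PyTorch sketch, assuming each file holds a plain state dict for models of identical architecture (the function name and checkpoint layout are assumptions, not tied to this paper's code):

```python
import torch

def average_checkpoints(paths):
    """Element-wise average of parameters across several checkpoints,
    e.g. models self-trained on different pseudo-labeled data."""
    avg, n = None, len(paths)
    for path in paths:
        state = torch.load(path, map_location="cpu")  # assumed: a raw state dict
        if avg is None:
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / n for k, v in avg.items()}
```

If the averaged models all overfit to the same incorrect pseudo-labels, this mean reinforces their shared errors, which is the amplification problem the paper targets.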