Unsupervised and semi-supervised adaptation of a hybrid speech recognition system
Published in | 2012 IEEE 11th International Conference on Signal Processing Vol. 1; pp. 527 - 530 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.10.2012 |
Summary: | This paper evaluates a recently published method for supervised and unsupervised adaptation of neural networks used in hybrid speech recognition systems. The neural networks used in hybrid speech recognition have distinct characteristics that make the usual adaptation methods (such as retraining the neural network) unusable or ineffective. The recently published MELT (Minimum Error Linear Transform) method [1] was developed to cope with this issue. By establishing a link between the intermediate features and the long temporal features, it significantly reduces the number of free variables, so that the resulting adaptation parameters can be estimated robustly. The experiments were performed on the WSJCAM0 speech corpus. In contrast to the original paper [1], the experiments used a word recognizer instead of a phoneme recognizer. The experimental results suggest that the MELT method can be used in both an unsupervised and a semi-supervised manner and that, when applied, it leads to a significant reduction in word error rate, even with a strong language model. |
---|---|
ISBN: | 9781467321969 1467321966 |
ISSN: | 2164-5221 |
DOI: | 10.1109/ICoSP.2012.6491542 |