fMPE: discriminatively trained features for speech recognition
Published in | Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, Vol. 1, pp. I/961 - I/964 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 2005 |
Summary: | MPE (minimum phone error) is a previously introduced technique for discriminative training of HMM parameters. fMPE applies the same objective function to the features, transforming the data with a kernel-like method and training millions of parameters, comparable to the size of the acoustic model. Despite the large number of parameters, fMPE is robust to over-training. The method is to train a matrix projecting from posteriors of Gaussians to a normal size feature space, and then to add the projected features to normal features such as PLP. The matrix is trained from a zero start using a linear method. Sparsity of posteriors ensures speed in both training and test time. The technique gives similar improvements to MPE (around 10% relative). MPE on top of fMPE results in error rates up to 6.5% relative better than MPE alone, or more if multiple layers of transform are trained. |
---|---|
ISBN: | 9780780388741 0780388747 |
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.2005.1415275 |
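
The summary above describes the transform as a matrix that projects Gaussian posteriors into the normal feature space, with the result added to the original features. The following is a minimal sketch of that single step, not the authors' implementation. Several things here are assumptions for illustration: the function names `gaussian_posteriors` and `fmpe_transform`, the use of diagonal-covariance Gaussians, and the `top_k` truncation (with renormalization) used to make the posterior vector sparse. The MPE training of the matrix `M` is not shown.

```python
import numpy as np


def gaussian_posteriors(x, means, variances, weights, top_k=2):
    """Sparse posterior vector over diagonal-covariance Gaussians for one frame.

    Assumption: sparsity is imposed by keeping the top_k posteriors and
    renormalizing; the summary only says the posteriors are sparse.
    """
    # Per-Gaussian log-likelihood of x under N(mean, diag(variance)).
    log_lik = -0.5 * np.sum(
        (x - means) ** 2 / variances + np.log(2.0 * np.pi * variances), axis=1
    )
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)  # subtract the max for numerical stability
    post = np.exp(log_post)
    post /= post.sum()

    # Zero out everything except the top_k largest posteriors.
    keep = np.argsort(post)[-top_k:]
    sparse = np.zeros_like(post)
    sparse[keep] = post[keep]
    return sparse / sparse.sum()


def fmpe_transform(x, M, means, variances, weights, top_k=2):
    """Return y = x + M h, where h is the sparse posterior vector for x.

    M has shape (feature_dim, num_gaussians). Only the columns of M that
    match nonzero posteriors are touched, which is where the speed comes from.
    """
    h = gaussian_posteriors(x, means, variances, weights, top_k)
    nz = np.flatnonzero(h)
    return x + M[:, nz] @ h[nz]


# Toy usage with made-up sizes: 3-dimensional features, 8 Gaussians.
rng = np.random.default_rng(0)
dim, num_gauss = 3, 8
means = rng.normal(size=(num_gauss, dim))
variances = np.ones((num_gauss, dim))
weights = np.full(num_gauss, 1.0 / num_gauss)
x = rng.normal(size=dim)

# "Zero start": with M = 0 the transformed features equal the originals.
M = np.zeros((dim, num_gauss))
y = fmpe_transform(x, M, means, variances, weights)
```

Starting from a zero matrix means the transformed features initially equal the original features, so training begins from the baseline system's behavior.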