Subspace Gaussian mixture model with state-dependent subspace dimensions

Bibliographic Details
Published in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1725 - 1729
Main Authors: Ko, Tom; Mak, Brian; Leung, Cheung-Chi
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2014

Summary: In recent years, under the hidden Markov modeling (HMM) framework, the use of subspace Gaussian mixture models (SGMMs) has demonstrated better recognition performance than traditional Gaussian mixture models (GMMs) in automatic speech recognition. In the state-of-the-art SGMM formulation, a fixed subspace dimension is assigned to every phone state. While a constant subspace dimension is easier to implement, it may lead to overfitting or underfitting of some state models, since the data is usually distributed unevenly among the states. In a later extension of SGMM, states are split into sub-states with an appropriate objective function, which eases the problem by increasing the number of state-specific parameters for underfitted states. In this paper, we propose another solution: each sub-state is allowed a different subspace dimension depending on the amount of its training frames, so that the state-specific parameters can be robustly estimated. Experimental evaluation on the Switchboard recognition task shows that the proposed method improves on the existing SGMM training procedure.
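The core idea of the summary — scaling each sub-state's subspace dimension with its amount of training data — can be sketched as a simple selection rule. The thresholds, dimensions, and the function `choose_subspace_dim` below are illustrative assumptions, not the rule used in the paper; they only show how a per-sub-state dimension might grow with the frame count and be capped at the full subspace dimension.

```python
def choose_subspace_dim(num_frames, full_dim=40, min_dim=10, frames_per_dim=200):
    """Pick a subspace dimension for one sub-state from its training-frame count.

    Hypothetical heuristic: grant one dimension per `frames_per_dim` frames,
    clamped to [min_dim, full_dim], so that sparsely observed sub-states get
    a small, robustly estimable dimension while data-rich ones keep the full
    subspace dimension.
    """
    dim = num_frames // frames_per_dim
    return max(min_dim, min(full_dim, dim))


# A sub-state with few frames falls back to the minimum dimension,
# while a well-observed sub-state is capped at the full dimension.
print(choose_subspace_dim(100))     # sparse sub-state
print(choose_subspace_dim(100000))  # data-rich sub-state
```

Any monotone mapping from frame count to dimension would serve the same purpose; the key property is that the number of state-specific parameters never outgrows the data available to estimate them.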
ISSN:1520-6149
2379-190X
DOI:10.1109/ICASSP.2014.6853893