Unsupervised hierarchical adaptation using reliable selection of cluster-dependent parameters

Bibliographic Details
Published in Speech Communication, Vol. 30, no. 4, pp. 235-253
Main Authors Chien, Jen-Tzung; Junqua, Jean-Claude
Format Journal Article
Language English
Published Amsterdam, Elsevier B.V., 01.04.2000
Summary: Adaptation of speaker-independent hidden Markov models (HMMs) to a new speaker using speaker-specific data is an effective approach to improving speech recognition performance for the enrolled speaker. In practice, it is desirable to perform the adaptation flexibly, without any prior knowledge of or limitation on the enrolled adaptation data (e.g. data transcription, length and content). However, the inevitable transcription errors may make the model adaptation (or transformation) unreliable. The variable length and content of the adaptation data usually make it necessary to dynamically control the degree of sharing in transformation-based adaptation. This paper presents an unsupervised hierarchical adaptation algorithm for flexible speaker adaptation. We build a tree structure of HMMs so that the degree of transformation sharing can be controlled. To perform the unsupervised learning, we apply Bayesian theory to estimate the transformation parameters and the data transcription. To select the parameters for hierarchical model transformation, we developed a new algorithm based on the maximum confidence measure (MCM) and minimum description length (MDL) criteria. Experimental comparisons on unsupervised speaker adaptation show that the hybrid adaptation scheme based on the MCM and MDL criteria achieves the best recognition results for any length of enrollment data.
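The summary does not give the selection procedure itself, so the following is only a generic illustration of how an MDL criterion can control transformation sharing in a tree, not the authors' algorithm. It assumes a common two-part description length, the negative log-likelihood plus (k/2) log N per transformation, and picks the cheapest set of tree nodes bottom-up. The names `Node`, `description_length` and `select_cut`, and all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    # Hypothetical log-likelihood of this cluster's adaptation data
    # when the cluster is given its own transformation.
    loglik: float
    children: List["Node"] = field(default_factory=list)


def description_length(loglik: float, n_params: int, n_frames: int) -> float:
    # Assumed two-part MDL cost: data cost plus (k/2) * log(N) model cost.
    return -loglik + 0.5 * n_params * math.log(n_frames)


def select_cut(node: Node, n_params: int, n_frames: int) -> Tuple[float, List[str]]:
    """Return (cost, node names) of the cheapest cut in the subtree at `node`."""
    own = description_length(node.loglik, n_params, n_frames)
    if not node.children:
        return own, [node.name]
    child_cost, child_names = 0.0, []
    for child in node.children:
        cost, names = select_cut(child, n_params, n_frames)
        child_cost += cost
        child_names += names
    # Keep one shared transformation unless splitting is strictly cheaper.
    if child_cost < own:
        return child_cost, child_names
    return own, [node.name]


# Toy tree: one root cluster split into two child clusters.
tree = Node("root", -1000.0, [Node("vowels", -400.0), Node("consonants", -580.0)])
cost, cut = select_cut(tree, n_params=10, n_frames=200)
print(cut, round(cost, 2))
```

In this sketch, a small likelihood gain from splitting does not outweigh the extra parameter cost, so the transformation stays shared at the root; a larger gain, or more adaptation data relative to the penalty, would push the selection down to the child clusters.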
ISSN: 0167-6393, 1872-7182
DOI: 10.1016/S0167-6393(99)00052-7