On the use of lattices for the automatic generation of pronunciations

Bibliographic Details
Published in	2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03) Vol. 1; p. I
Main Authors	Deligne, S., Mangu, L.
Format	Conference Proceeding
Language	English
Published	IEEE 2003
Subjects	Cepstral analysis; Context modeling; Decision trees; Gaussian distribution; Gaussian processes; Lattices; Merging; Speech recognition; Viterbi algorithm; Vocabulary

Summary:	In this paper, we explore the use of lattices to generate pronunciations for speech recognition based on the observation of a few (say one or two) speech utterances of a word. Various search strategies are investigated in combination with schemes where single or multiple pronunciations are generated for each speech utterance. In our experiments, a strategy that combines merging time-overlapping links in a context-dependent subphone lattice and generating multiple pronunciations provides the best recognition accuracy. This results in average relative gains of 30% over the generation of single pronunciations using a Viterbi search.
ISBN:	9780780376632; 0780376633
ISSN:	1520-6149; 2379-190X
DOI:	10.1109/ICASSP.2003.1198752