An automatic method for learning a Japanese lexicon for recognition of spontaneous speech

Bibliographic Details
Published in	Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), Vol. 1, pp. 305-308
Main Authors	Mayfield Tomokiyo, L., Ries, K.
Format	Conference Proceeding
Language	English
Published	IEEE 1998
Subjects	Automatic speech recognition; Character recognition; Dictionaries; Humans; Indium tin oxide; Mutual information; Natural languages; Performance evaluation; Speech recognition; Training data
Summary:	When developing a speech recognition system, one must start by deciding what the units to be recognized should be. This is for the most part a straightforward choice in the case of word-based languages such as English, but becomes an issue even in handling languages with a complex compounding system like German; with an agglutinative language like Japanese, which provides no spaces in written text, the choice is not at all obvious. Once an appropriate unit has been determined, the problem of consistently segmenting transcriptions of training data must be addressed. This paper describes a method for learning a lexicon from a training corpus which contains no word-level segmentation, applied to the problem of building a Japanese speech recognition system. We show not only that one can satisfactorily segment transcribed training data automatically, avoiding human error, but also that our system, when trained with the automatically segmented corpus, showed a significant improvement in recognition performance.
ISBN:	9780780344280; 0780344286
ISSN:	1520-6149; 2379-190X
DOI:	10.1109/ICASSP.1998.674428