Bilingual acoustic modeling with state mapping and three-stage adaptation for transcribing unbalanced code-mixed lectures


Bibliographic Details
Published in	2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5020-5023
Main Authors	Ching-Feng Yeh, Liang-Che Sun, Chao-Yu Huang, Lin-Shan Lee
Format	Conference Proceeding
Language	English
Published	IEEE 01.05.2011
Subjects	acoustic modeling; adaptation; Adaptation models; bilingual; code-mixing; lecture; Silicon; state mapping; Switches

Summary:	This paper presents a bilingual acoustic modeling approach for transcribing Mandarin-English code-mixed lectures with highly unbalanced language distribution. Special terminologies for the content were produced in the guest language of English (about 15%) and embedded in the utterances produced in the host language of Mandarin (about 85%). The code-mixing nature of the target corpus and the very small percentage of the English data made the task difficult. State mapping and merging approaches, plus three stages of model adaptation, handle the above problem. Significant improvements in recognition accuracy were obtained in the experiment with a real bilingual code-mixed lecture corpus recorded at National Taiwan University. The code-mixing situation considered is actually very natural in the spoken language of the daily lives of many people in the globalized world today.
ISBN:	9781457705380, 1457705389
ISSN:	1520-6149, 2379-190X
DOI:	10.1109/ICASSP.2011.5947484