A First LVCSR System for Luxembourgish, a Low-Resourced European Language

Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s low-resourced languages. We describe our efforts in building a large vocabulary ASR system for such a “minority” language without resorting to any prior transcribed aud...

Full description

Saved in:
Bibliographic Details
Published inHuman Language Technology Challenges for Computer Science and Linguistics Vol. 8387; pp. 479 - 490
Main Authors Adda-Decker, Martine, Lamel, Lori, Adda, Gilles, Lavergne, Thomas
Format Book Chapter
LanguageEnglish
Published Cham Springer International Publishing 2014
SeriesLecture Notes in Computer Science
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s low-resourced languages. We describe our efforts in building a large vocabulary ASR system for such a “minority” language without resorting to any prior transcribed audio training data. Instead, acoustic models are derived from major European languages. Furthermore, most Luxembourgish written sources include significant parts in other languages. This poses specific challenges to Language Model estimation. Some scientific and technological issues addressed include: (i) how to build acoustic models if no labeled acoustic training data are available for the under-resourced target language? (ii) how to make use of the new system to accelerate resource production for the target language? (iii) how to build a vocabulary and a language model with multilingual written texts? (iv) how to determine the “best” phonemic inventory for ASR? First ASR results illustrate the accuracy of the various sets of monolingual and multilingual acoustic models and what these suggest concerning language typology issues.
ISBN:9783319089577
3319089579
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-319-08958-4_39