ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus

Bibliographic Details
Published in: 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp. 2060-2064
Main Authors: Stanescu, Miruna; Cucu, H.; Buzo, A.; Burileanu, C.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.08.2012
Summary: The construction of automatic speech recognition (ASR) systems is fundamentally dependent on the speech corpus used to train the acoustic models. The speech corpus should be phonetically balanced to assure that the acoustic models are properly trained. This paper presents the design and development of the first phonetically balanced Romanian speech corpus. It describes all the language processing steps taken in order to obtain a proper set of phrases, discusses some important aspects regarding Romanian phonetics and emphasizes the phrase selection mechanism.
ISBN: 1467310689, 9781467310680
ISSN: 2219-5491