Automatic generation of speech synthesis units based on closed loop training

Bibliographic Details
Published in 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 2; pp. 963-966
Main Authors Kagoshima, T., Akamine, M.
Format Conference Proceeding
Language English
Published Washington DC IEEE 1997
IEEE Computer Society Press
Summary: This paper proposes a new method for automatically generating speech synthesis units. A small set of synthesis units is selected from a large speech database by the proposed closed loop training (CLT) method. Because CLT is based on evaluating and minimizing the distortion caused by the synthesis process, such as prosodic modification, the selected synthesis units are most suitable for synthesizers. CLT is applied to a waveform-concatenation-based synthesizer whose basic unit is CV/VC (diphone). It is shown that synthesis units can be efficiently generated by CLT from a labeled speech database with a small amount of computation. Moreover, the synthesized speech is clear and smooth even though the storage size of the waveform dictionary is small.
ISBN:0818679190
9780818679193
ISSN:1520-6149
2379-190X
DOI:10.1109/ICASSP.1997.596098