Voice conversion using conditional restricted Boltzmann machine

Bibliographic Details
Published in 2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP), pp. 110-114
Main Authors Fengyun Zhu, Ziye Fan, Xihong Wu
Format Conference Proceeding
Language English
Published IEEE 01.07.2014
Summary: In this paper, we propose a new method for voice conversion using a conditional restricted Boltzmann machine (conditional RBM, CRBM). The joint distribution of source and target acoustic features is modeled by the RBM part of the model. Short-term temporal constraints are introduced by conditioning on contextual frames, namely the past and future frames of the source speaker. In contrast to conventional methods, the temporal structure of the data can be modeled without using dynamic features. Objective and subjective experiments were conducted to evaluate the method. The experimental results show that short-term temporal structure can be modeled well by the CRBM, and that the proposed method significantly outperforms the conventional method based on joint density Gaussian mixture models.
ISBN: 9781479954018, 1479954012
DOI:10.1109/ChinaSIP.2014.6889212