Voice conversion using conditional restricted Boltzmann machine

Bibliographic Details
Published in	2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP), pp. 110-114
Main Authors	Fengyun Zhu, Ziye Fan, Xihong Wu
Format	Conference Proceeding
Language	English
Published	IEEE 01.07.2014
Subjects	Acoustics; Conditional restricted Boltzmann machine; Data models; Joints; Speech; Training; Vectors; Voice conversion
Summary:	In this paper, we propose a new method for voice conversion using a conditional restricted Boltzmann machine (conditional RBM, CRBM). The joint distribution of source and target acoustic features is modeled by the RBM part of the model. Short-term temporal constraints are introduced by conditioning on contextual frames, namely the past and future frames of the source speaker. In contrast to conventional methods, the temporal structure of the data can be modeled without using dynamic features. Objective and subjective experiments were conducted to evaluate the method. The experimental results show that short-term temporal structure is modeled well by the CRBM, and that the proposed method significantly outperforms the conventional method based on joint density Gaussian mixture models.
ISBN:	9781479954018, 1479954012
DOI:	10.1109/ChinaSIP.2014.6889212
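
The summary describes conditioning on contextual frames, that is, the past and future frames of the source speaker. The following is a minimal sketch of how such a context input might be assembled from a sequence of source feature frames. It is not taken from the paper: the function name `stack_context`, the `past` and `future` window sizes, and the edge padding by frame repetition are all assumptions made for illustration, and the sketch only builds the conditioning input, not the CRBM itself.

```python
import numpy as np


def stack_context(frames: np.ndarray, past: int, future: int) -> np.ndarray:
    """Build a context vector for every frame of a (T, D) feature sequence.

    For frame t, the context is the concatenation of the `past` preceding
    frames and the `future` following frames (the current frame is excluded).
    Edges are padded by repeating the first or last frame; this padding rule
    is an assumption for illustration, not something stated in the summary.
    """
    num_frames, _ = frames.shape
    if past == 0 and future == 0:
        # No context requested: return an empty context for each frame.
        return np.empty((num_frames, 0), dtype=frames.dtype)

    padded = np.concatenate(
        [
            np.repeat(frames[:1], past, axis=0),     # repeat first frame
            frames,
            np.repeat(frames[-1:], future, axis=0),  # repeat last frame
        ],
        axis=0,
    )

    # Offsets relative to the current frame, skipping 0 (the frame itself).
    offsets = [o for o in range(-past, future + 1) if o != 0]
    columns = [padded[past + o : past + o + num_frames] for o in offsets]
    return np.concatenate(columns, axis=1)
```

In a setup like the one the summary outlines, each row returned by this hypothetical helper would serve as the conditioning input for the corresponding frame, while the current source and target features would form the visible units of the RBM part.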