A CONTROLLABLE CROSS-GENDER VOICE CONVERSION FOR SOCIAL ROBOT
In this study, we propose a conversion intensity controllable model for voice conversion (VC). In particular, we combine the CycleGAN and transformer module, and build a condition embedding network as a control parameter. The model is first pre-trained with self-supervised learning on the voice...
Published in | 2022 10th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) pp. 1 - 4 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English, Japanese |
Published | IEEE, 18.10.2022 |
Subjects | |
Summary: | In this study, we propose a voice conversion (VC) model whose conversion intensity is controllable. In particular, we combine CycleGAN with a transformer module and build a condition embedding network that provides the control parameter. The model is first pre-trained with self-supervised learning on a voice reconstruction task, with the condition set to male-to-male or female-to-female. After pre-training, we retrain the model on the cross-gender voice conversion task, with the condition set to male-to-female or female-to-male. At test time, the condition is intended to serve as a controllable parameter (scale). The proposed method was evaluated on the Voice Conversion Challenge dataset and compared with two baselines (CycleGAN and CycleTransGAN) using objective and subjective evaluations. The results show that the proposed model converts voices with competitive performance while adding cross-gender controllability. |
---|---|
DOI: | 10.1109/ACIIW57231.2022.10086038 |
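
The summary describes a condition that is fixed to a same-gender setting during pre-training, fixed to a cross-gender setting during retraining, and used as a scale at test time. One possible reading of that idea is sketched below in plain Python. This is an illustrative assumption, not the authors' implementation: the names `EMBEDDINGS` and `blend_condition`, the four-dimensional toy vectors, and the use of linear interpolation between condition embeddings are all hypothetical and do not come from the record.

```python
# Hypothetical sketch: names, vector values, and the interpolation rule are
# assumptions for illustration; the catalog record does not specify them.

# Toy stand-in for a learned condition embedding table, one vector per condition.
EMBEDDINGS = {
    "m2m": [1.0, 0.0, 0.0, 0.0],  # male-to-male (reconstruction, pre-training)
    "f2f": [0.0, 1.0, 0.0, 0.0],  # female-to-female (reconstruction, pre-training)
    "m2f": [0.0, 0.0, 1.0, 0.0],  # male-to-female (cross-gender, retraining)
    "f2m": [0.0, 0.0, 0.0, 1.0],  # female-to-male (cross-gender, retraining)
}


def blend_condition(source_gender, scale):
    """Linearly interpolate between the same-gender and cross-gender embeddings.

    scale = 0.0 returns the reconstruction condition, scale = 1.0 returns the
    full cross-gender condition, and values in between mix the two.
    """
    if source_gender not in ("m", "f"):
        raise ValueError("source_gender must be 'm' or 'f'")
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must lie in [0, 1]")
    same = EMBEDDINGS["m2m" if source_gender == "m" else "f2f"]
    cross = EMBEDDINGS["m2f" if source_gender == "m" else "f2m"]
    return [(1.0 - scale) * s + scale * c for s, c in zip(same, cross)]


if __name__ == "__main__":
    # Sweep the intensity for a male source speaker.
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, blend_condition("m", alpha))
```

In a real model the blended vector would be passed to the generator alongside the acoustic features; how the paper actually injects the condition is not stated in this record.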