Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts
Xin, Detai, Adavanne, Sharath, Ang, Federico, Kulkarni, Ashish, Takamichi, Shinnosuke, Saruwatari, Hiroshi
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04.06.2023)
Conference Proceeding
Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech
Yang, Dong, Koriyama, Tomoki, Saito, Yuki, Saeki, Takaaki, Xin, Detai, Saruwatari, Hiroshi
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04.06.2023)
Conference Proceeding
Mid-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models
Watanabe, Aya, Takamichi, Shinnosuke, Saito, Yuki, Xin, Detai, Saruwatari, Hiroshi
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04.06.2023)
Conference Proceeding
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions
Xin, Detai, Jiang, Junfeng, Takamichi, Shinnosuke, Saito, Yuki, Aizawa, Akiko, Saruwatari, Hiroshi
Published in arXiv.org (09.10.2023)
Paper
Journal Article
JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions
Xin, Detai, Takamichi, Shinnosuke, Saruwatari, Hiroshi
Published in Speech communication (01.01.2024)
Journal Article
Disentangled Speaker and Language Representations Using Mutual Information Minimization and Domain Adaptation for Cross-Lingual TTS
Xin, Detai, Komatsu, Tatsuya, Takamichi, Shinnosuke, Saruwatari, Hiroshi
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06.06.2021)
Conference Proceeding
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
Xin, Detai, Takamichi, Shinnosuke, Morimatsu, Ai, Saruwatari, Hiroshi
Year of Publication 21.05.2023
Journal Article
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models
Watanabe, Aya, Takamichi, Shinnosuke, Saito, Yuki, Xin, Detai, Saruwatari, Hiroshi
Year of Publication 18.10.2022
Journal Article
Building speech corpus with diverse voice characteristics for its prompt-based representation
Watanabe, Aya, Takamichi, Shinnosuke, Saito, Yuki, Nakata, Wataru, Xin, Detai, Saruwatari, Hiroshi
Year of Publication 20.03.2024
Journal Article
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Xin, Detai, Takamichi, Shinnosuke, Okamoto, Takuma, Kawai, Hisashi, Saruwatari, Hiroshi
Year of Publication 22.04.2022
Journal Article
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control
Watanabe, Aya, Takamichi, Shinnosuke, Saito, Yuki, Nakata, Wataru, Xin, Detai, Saruwatari, Hiroshi
Year of Publication 23.09.2023
Journal Article
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics
Park, Joonyong, Takamichi, Shinnosuke, Nakamura, Tomohiko, Seki, Kentaro, Xin, Detai, Saruwatari, Hiroshi
Year of Publication 01.06.2023
Journal Article
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech
Yang, Dong, Koriyama, Tomoki, Saito, Yuki, Saeki, Takaaki, Xin, Detai, Saruwatari, Hiroshi
Year of Publication 27.02.2023
Journal Article
Improving Speech Prosody of Audiobook Text-to-Speech Synthesis with Acoustic and Textual Contexts
Xin, Detai, Adavanne, Sharath, Ang, Federico, Kulkarni, Ashish, Takamichi, Shinnosuke, Saruwatari, Hiroshi
Year of Publication 04.11.2022
Journal Article
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
Xin, Detai, Tan, Xu, Shen, Kai, Ju, Zeqian, Yang, Dongchao, Wang, Yuancheng, Takamichi, Shinnosuke, Saruwatari, Hiroshi, Liu, Shujie, Li, Jinyu, Zhao, Sheng
Year of Publication 04.04.2024
Journal Article