Neural network generative modeling to transform speech utterances and augment training data


Bibliographic Details
Main Authors: Burke, Ryan; Narayanan, Praveen; Micks, Ashley Elizabeth; Charette, Francois; Scaria, Lisa
Format: Patent
Language: English
Published: 02.03.2021
Summary: Systems, methods, and devices for speech transformation and generating synthetic speech using deep generative models are disclosed. A method of the disclosure includes receiving input audio data comprising a plurality of iterations of a speech utterance from a plurality of speakers. The method includes generating an input spectrogram based on the input audio data and transmitting the input spectrogram to a neural network configured to generate an output spectrogram. The method includes receiving the output spectrogram from the neural network and, based on the output spectrogram, generating synthetic audio data comprising the speech utterance.
Bibliography: Application Number: US201815940639
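
Illustrative sketch: The data flow described in the summary (audio to input spectrogram, spectrogram through a network, output spectrogram back to audio) can be sketched as below. This is a minimal illustration and not the patented implementation. The function names, the frame size and hop length, the reuse of the input phase for inversion, and the identity `placeholder_network` are all assumptions introduced here; the record does not specify the model architecture or the inversion method.

```python
import numpy as np


def stft(signal, n_fft=512, hop=128):
    """Short-time Fourier transform with a Hann window (assumed settings)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, n_fft // 2 + 1)


def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by windowed overlap-add with squared-window normalization."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start : start + n_fft] += frames[i] * window
        norm[start : start + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)


def placeholder_network(magnitude):
    """Hypothetical stand-in for the neural network: returns its input unchanged.

    A trained generative model would map the input spectrogram to a
    transformed output spectrogram of the same shape.
    """
    return magnitude.copy()


def transform_utterance(audio, n_fft=512, hop=128):
    """Audio -> input spectrogram -> network -> output spectrogram -> audio."""
    complex_spec = stft(audio, n_fft, hop)
    input_spectrogram = np.abs(complex_spec)
    output_spectrogram = placeholder_network(input_spectrogram)
    # Assumption: reuse the input phase to invert the output magnitudes.
    phase = np.angle(complex_spec)
    return istft(output_spectrogram * np.exp(1j * phase), n_fft, hop)
```

With the identity placeholder, `transform_utterance` reproduces its input away from the signal edges, which makes it a convenient check of the spectrogram round trip before substituting a real model for `placeholder_network`.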