GENERATING AUDIO USING NEURAL NETWORKS

Bibliographic Details
Main Authors	KALCHBRENNER, Nal Emmerich; VINYALS, Oriol; VAN DEN OORD, Aaron Gerard Antonius; DIELEMAN, Sander Etienne Lea; SIMONYAN, Karen
Format	Patent
Language	English, French, German
Published	02.11.2022
Subjects	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Summary:	A method of training a neural network system to generate an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes processing a training sequence of audio data using a convolutional subnetwork. The convolutional subnetwork is configured to, for each of the time steps, receive a current sequence of audio data that comprises an audio sample at each time step that precedes the time step in the training sequence, and process the current sequence of audio data to generate an alternative representation for the time step. The alternative representation for the time step is used to generate an output that defines a score distribution over a plurality of possible audio samples for the time step. The neural network system is trained using supervised learning based on the input audio samples and the output score distribution.
Bibliography:	Application Number: EP20200195353
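
The summary's per-time-step pipeline (preceding samples, convolutional representation, score distribution, supervised training signal) can be illustrated with a minimal sketch in plain Python. This is not the patented implementation: the single causal filter, the tanh nonlinearity, the linear output layer, the negative log-likelihood loss, and every name and parameter below are assumptions chosen only to make the idea concrete.

```python
import math


def causal_conv(scaled, weights, bias):
    """Representation for step t uses only samples at steps before t.

    Assumption: a single filter, zero padding for missing history, tanh.
    """
    reps = []
    for t in range(len(scaled)):
        acc = bias
        for j, w in enumerate(weights):
            idx = t - 1 - j  # strictly earlier than t
            if idx >= 0:
                acc += w * scaled[idx]
        reps.append(math.tanh(acc))
    return reps


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_distributions(samples, params, num_levels):
    """One score distribution over the possible sample values per time step.

    `samples` are integer quantization levels in range(num_levels)
    (the quantization scheme itself is an assumption of this sketch).
    """
    scaled = [2.0 * s / (num_levels - 1) - 1.0 for s in samples]
    reps = causal_conv(scaled, params["conv_w"], params["conv_b"])
    return [
        softmax([params["out_w"][c] * r + params["out_b"][c]
                 for c in range(num_levels)])
        for r in reps
    ]


def training_loss(samples, params, num_levels):
    """Mean negative log-likelihood of each true sample under its distribution.

    Assumption: the summary says only "supervised learning"; this loss is
    one common choice, not necessarily the one claimed.
    """
    dists = score_distributions(samples, params, num_levels)
    nll = -sum(math.log(d[s]) for d, s in zip(dists, samples))
    return nll / len(samples)
```

In this sketch the training sequence serves as both input and target: the distribution produced for step t is scored against the actual sample at step t, which the causal filter never sees when computing that step's representation.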