GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Bibliographic Details
Main Authors	TAGLIASACCHI, Marco, DENK, Timo Immanuel, ZEGHIDOUR, Neil, ENGEL, Jesse, BORSOS, Zalán, AGOSTINELLI, Andrea, SHARIFI, Matthew, GRANGIER, David, MARINIER, Raphaël, TEBOUL, Olivier, CAILLON, Antoine, FRANK, Christian, VERZETTI, Mauro, ROBERTS, Adam Joseph
Format	Patent
Language	English, French
Published	14.03.2024
Subjects	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Summary:	Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
Bibliography:	Application Number: WO2023US32168
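
Illustrative sketch (not part of the patent record): the steps listed in the summary can be read as a three-stage pipeline, in which a semantic representation conditions a generative model that emits an acoustic representation, and a decoder turns that representation into a waveform. The Python below is a minimal toy under stated assumptions. The abstract does not say that the representations are discrete tokens, does not give vocabulary sizes, and does not describe the networks, so the token vocabularies, `toy_next_token_probs`, `toy_decoder`, and every constant are hypothetical stand-ins for trained models. The auto-regressive loop follows the record's title.

```python
import math
import random
from typing import List, Sequence

# Hypothetical sizes; the abstract specifies none of these.
ACOUSTIC_VOCAB = 16      # number of distinct acoustic tokens (assumption)
SAMPLES_PER_TOKEN = 4    # waveform samples produced per acoustic token (assumption)


def toy_next_token_probs(semantic: Sequence[int], prefix: Sequence[int]) -> List[float]:
    """Stand-in for a trained generative network.

    Returns a probability distribution over the next acoustic token,
    conditioned on the semantic tokens and the acoustic tokens so far.
    """
    seed = sum(semantic) * 31 + sum(prefix) * 17 + len(prefix)
    logits = [math.sin(seed + k) for k in range(ACOUSTIC_VOCAB)]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]  # stable softmax
    total = sum(exps)
    return [e / total for e in exps]


def generate_acoustic_tokens(semantic: Sequence[int], num_tokens: int,
                             rng: random.Random) -> List[int]:
    """Sample acoustic tokens one at a time, each conditioned on the
    semantic representation and on all previously sampled tokens."""
    tokens: List[int] = []
    for _ in range(num_tokens):
        probs = toy_next_token_probs(semantic, tokens)
        tokens.append(rng.choices(range(ACOUSTIC_VOCAB), weights=probs, k=1)[0])
    return tokens


def toy_decoder(acoustic: Sequence[int]) -> List[float]:
    """Stand-in for a decoder network: maps each acoustic token to a
    short run of constant samples in [-1, 1]."""
    waveform: List[float] = []
    for token in acoustic:
        level = 2.0 * token / (ACOUSTIC_VOCAB - 1) - 1.0
        waveform.extend([level] * SAMPLES_PER_TOKEN)
    return waveform


def generate_audio(semantic: Sequence[int], num_tokens: int, seed: int = 0) -> List[float]:
    """Semantic representation -> acoustic representation -> waveform."""
    rng = random.Random(seed)
    acoustic = generate_acoustic_tokens(semantic, num_tokens, rng)
    return toy_decoder(acoustic)
```

Calling `generate_audio([1, 2, 3], num_tokens=5)` returns a list of `5 * SAMPLES_PER_TOKEN` samples. In this sketch the semantic representation is passed in directly; how it is obtained in response to a request is not stated in the summary.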