Audio generation using autoregressive generative neural networks

Bibliographic Details
Main Authors	VERZETTI MAURO, DENK TIMO IMMANUEL, FRANK CHRISTIAN, BORSOS ZALAN, ZEGHIDOUR NEIL, TEBOUL OLIVIER, MARINIER RAPHAEL, CAILLON ANTOINE, ROBERTS ADAM JOSEPH, AGOSTINELLI ANDREA, SHARIFI MATTHEW, TAGLIASACCHI MARCO, ENGEL JESSE, GRANGIER DAVID
Format	Patent
Language	Korean
Published	27.06.2024
Subjects	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

More Information
Summary:	Korean abstract (translated): Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal. English abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
Bibliography:	Application Number: KR20247017365
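
The staged method in the summary (input, embedding tokens, semantic representation, acoustic representation, decoded audio) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, vocabulary sizes, and token rates are hypothetical, and simple arithmetic stands in for the embedding, generative, and decoder neural networks, whose architecture the record does not describe.

```python
import random
from typing import List

# All names, vocabulary sizes, and token rates below are hypothetical
# placeholders; none of them come from the patent record.
SEMANTIC_VOCAB = 16
ACOUSTIC_VOCAB = 32
ACOUSTIC_PER_SEMANTIC = 2   # acoustic tokens generated per semantic token
SAMPLES_PER_ACOUSTIC = 4    # audio samples produced per acoustic token


def embed_input(text: str) -> List[int]:
    """Stand-in for the embedding neural network: input -> embedding tokens."""
    return [ord(ch) % SEMANTIC_VOCAB for ch in text]


def generate_tokens(context: List[int], length: int, vocab: int,
                    rng: random.Random) -> List[int]:
    """Toy autoregressive loop: each new token depends on the conditioning
    context and on every token generated so far."""
    out: List[int] = []
    for _ in range(length):
        state = sum(context) + sum(out) + len(out)
        out.append((state + rng.randrange(vocab)) % vocab)
    return out


def decode(acoustic: List[int]) -> List[float]:
    """Stand-in for the decoder neural network: acoustic tokens -> samples
    in the range [-1.0, 1.0]."""
    samples: List[float] = []
    for tok in acoustic:
        value = 2.0 * tok / (ACOUSTIC_VOCAB - 1) - 1.0
        samples.extend([value] * SAMPLES_PER_ACOUSTIC)
    return samples


def generate_audio(text: str, semantic_len: int = 8, seed: int = 0) -> List[float]:
    """Run the four stages in order for one request."""
    rng = random.Random(seed)
    embedding = embed_input(text)
    # Semantic representation, conditioned on the embedding tokens.
    semantic = generate_tokens(embedding, semantic_len, SEMANTIC_VOCAB, rng)
    # Acoustic representation, conditioned on semantic and embedding tokens.
    acoustic = generate_tokens(semantic + embedding,
                               semantic_len * ACOUSTIC_PER_SEMANTIC,
                               ACOUSTIC_VOCAB, rng)
    return decode(acoustic)
```

Calling `generate_audio("a calm piano melody")` returns a list of float samples; the point of the sketch is only the order of the stages and what each stage is conditioned on, not the quality of the output.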