SURROUND SOUND TO IMMERSIVE AUDIO UPMIXING BASED ON VIDEO SCENE ANALYSIS

Bibliographic Details
Main Authors	OCAMPO, Carlos Tejeda, DEVANTIER, Allan Otto, BHARITKAR, Sunil Ganpat, OH, Seongnam
Format	Patent
Language	English, French
Published	13.06.2024
Subjects	DEAF-AID SETS; ELECTRIC COMMUNICATION TECHNIQUE; ELECTRICITY; LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; PICTORIAL COMMUNICATION, e.g. TELEVISION; PUBLIC ADDRESS SYSTEMS
Summary:	One embodiment provides a method of audio upmixing comprising performing video scene analysis by segmenting visual objects from video frames of a video, and performing audio analysis by extracting audio signals from audio corresponding to the video. The method further comprises determining whether any of the audio signals correspond to any of the visual objects, and estimating a video-based trajectory of a visual object if the visual object is in motion and transitions from on-screen to off-screen, or vice versa, during the video. The method further comprises positioning an audio trajectory of an audio signal from at least one speaker associated with the display to at least one other speaker associated with providing surround sound. The audio trajectory is automatically matched with the video. The audio signal is delivered to the at least one speaker and the at least one other speaker for audio reproduction during the presentation.
Bibliography:	Application Number: WO2023KR15705
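
Illustration:	The summary describes moving an audio signal from a display speaker to a surround speaker as its visual object leaves the screen. The following is a minimal Python sketch of that idea only, not the patented method. Everything in it is an assumption made for illustration: the function names `extrapolate_x` and `pan_gains`, the normalized coordinate system (x in 0..1 is on-screen, values above 1 are off-screen to the right), the linear extrapolation of the object track, and the constant-power crossfade between the two speakers.

```python
import math

def extrapolate_x(track, t):
    """Estimate horizontal position at time t from the last two (time, x) samples.

    Hypothetical helper: assumes constant velocity once the object is
    no longer visible, which the source record does not specify.
    """
    (t0, x0), (t1, x1) = track[-2], track[-1]
    velocity = (x1 - x0) / (t1 - t0)
    return x1 + velocity * (t - t1)

def pan_gains(x, screen_edge=1.0, surround_x=2.0):
    """Return (display_gain, surround_gain) for a normalized position x.

    Assumed layout: the display speaker sits at the screen edge and the
    surround speaker at surround_x. Gains follow a constant-power
    (sine/cosine) crossfade so the squared gains always sum to 1.
    """
    p = (x - screen_edge) / (surround_x - screen_edge)
    p = min(1.0, max(0.0, p))          # clamp to the span between speakers
    theta = p * math.pi / 2
    return math.cos(theta), math.sin(theta)

# Usage: an object tracked near the right edge, then followed off-screen.
track = [(0.0, 0.8), (0.1, 0.9)]       # (seconds, normalized x), made-up data
for t in (0.1, 0.4, 0.7, 1.0, 1.3):
    x = extrapolate_x(track, t)
    front, surround = pan_gains(x)
    print(f"t={t:.1f}s x={x:.2f} display={front:.3f} surround={surround:.3f}")
```

While the object is on-screen, the sketch keeps all of the signal in the display speaker. Once the extrapolated position passes the screen edge, the signal fades toward the surround speaker at constant total power.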