AUTOMATIC LEVELING OF SPEECH CONTENT

Bibliographic Details
Main Authors: YEH, Chunghsin; CENGARLE, Giulio; DE BURGH, Mark David
Format: Patent
Language: English, French, German
Published: 28.08.2024
Summary: Embodiments are disclosed for automatic leveling of speech content. In an embodiment, a method comprises: receiving, using one or more processors, frames of an audio recording including speech and non-speech content; for each frame: determining, using the one or more processors, a speech probability; analyzing, using the one or more processors, a perceptual loudness of the frame; obtaining, using the one or more processors, a target loudness range for the frame; computing, using the one or more processors, gains to apply to the frame based on the target loudness range and the perceptual loudness analysis, where the gains include dynamic gains that change frame-by-frame and that are scaled based on the speech probability; and applying the gains to the frame so that a resulting loudness range of the speech content in the audio recording fits within the target loudness range.
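
The per-frame steps named in the summary can be illustrated with a minimal sketch. It is not the patented method: it assumes RMS level in dB as a crude stand-in for perceptual loudness, a speech probability supplied by some external detector, hypothetical target bounds of -26 dB and -20 dB, and a dynamic gain that is simply multiplied by the speech probability. All function and parameter names are invented for illustration.

```python
import math


def frame_loudness_db(frame):
    """RMS level in dB; an assumed stand-in for a perceptual loudness measure."""
    if not frame:
        return -120.0
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def dynamic_gain_db(loudness_db, target_low_db, target_high_db, speech_prob):
    """Gain (dB) that moves the frame to the nearest edge of the target range,
    scaled linearly by the speech probability (an assumed scaling rule)."""
    if loudness_db < target_low_db:
        raw_gain_db = target_low_db - loudness_db    # boost quiet frames
    elif loudness_db > target_high_db:
        raw_gain_db = target_high_db - loudness_db   # attenuate loud frames
    else:
        raw_gain_db = 0.0                            # already inside the range
    return speech_prob * raw_gain_db


def level_frames(frames, speech_probs, target_low_db=-26.0, target_high_db=-20.0):
    """Apply a per-frame dynamic gain to each frame; returns new frames."""
    leveled = []
    for frame, prob in zip(frames, speech_probs):
        gain_db = dynamic_gain_db(
            frame_loudness_db(frame), target_low_db, target_high_db, prob
        )
        gain = 10.0 ** (gain_db / 20.0)              # dB to linear amplitude
        leveled.append([x * gain for x in frame])
    return leveled


if __name__ == "__main__":
    quiet_speech = [0.01] * 480    # about -40 dB, treated as speech
    quiet_noise = [0.01] * 480     # same level, treated as non-speech
    out = level_frames([quiet_speech, quiet_noise], [1.0, 0.0])
    print(frame_loudness_db(out[0]), frame_loudness_db(out[1]))
```

In this sketch a frame with speech probability 1 is moved fully to the edge of the target range, while a frame with speech probability 0 is left untouched, so non-speech content is not boosted along with the speech. A practical system would also need gain smoothing between frames, which the sketch omits.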
Bibliography: Application Number: EP20210719785