DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement

Bibliographic Details
Published in	arXiv.org
Main Authors	Schröter, Hendrik; Rosenkranz, Tobias; Escalante-B, Alberto N; Maier, Andreas
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 14.05.2023
Subjects	Algorithms; Real time; Speech processing; Time factors
Summary:	Multi-frame algorithms for single-channel speech enhancement can take advantage of short-time correlations within the speech signal. Deep Filtering (DF) was proposed to exploit these correlations by directly estimating a complex filter in the frequency domain. In this work, we present a real-time speech enhancement demo using DeepFilterNet. DeepFilterNet's efficiency is enabled by exploiting domain knowledge of speech production and psychoacoustic perception. Our model matches state-of-the-art speech enhancement benchmarks while achieving a real-time factor of 0.19 on a single-threaded notebook CPU. The framework and pretrained weights have been published under an open-source license.
ISSN:	2331-8422