Speech enhancement based on nonnegative matrix factorization in constant-Q frequency domain
Published in | Applied acoustics Vol. 174; p. 107732 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.03.2021 |
Subjects | |
Summary: | • We propose using the constant-Q transform (CQT) to perform speech enhancement. • We use the NMF and SNMF methods to perform speech enhancement. • The proposed CQT method gives high resolution at low frequencies. • The proposed CQT method shows better enhancement ability, especially at low SNR. • Compared with the NMF algorithm, the SNMF algorithm gives better enhancement.
Speech is easily corrupted by additive noise in real environments. To reduce this noise, noisy speech can be enhanced by applying Nonnegative Matrix Factorization (NMF) or sparse NMF (SNMF) to its spectrogram. A high sampling rate captures more information, but the frequency range produced by the human vocal organs is low relative to the bandwidth a high sampling rate provides; thus, higher resolution is needed to describe the low frequencies. The traditional spectrogram based on the short-time Fourier transform (STFT) may lack frequency resolution at low frequencies. To address this, we propose using the constant-Q transform (CQT), which gives high resolution at low frequencies. The back-end algorithm remains NMF/SNMF. We evaluate the proposed method with the Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). The experimental results show that the proposed method enhances speech better than the STFT baseline at low signal-to-noise ratio (SNR). |
---|---|
ISSN: | 0003-682X |
DOI: | 10.1016/j.apacoust.2020.107732 |
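
The NMF/SNMF back end named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean cost with multiplicative updates, an L1 penalty `mu` on the activations for the sparse variant, dictionaries trained separately on clean speech and on noise, and a Wiener-style soft mask for reconstruction. Random nonnegative matrices stand in for CQT magnitude spectrograms, and the names `sparse_nmf` and `enhance` are hypothetical.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def sparse_nmf(V, rank, mu=0.0, n_iter=200, W_fixed=None, seed=0):
    """Approximate V by W @ H using multiplicative updates (Euclidean cost).

    mu is an L1 penalty on H; mu=0 reduces to plain NMF.
    If W_fixed is given, the dictionary is kept and only H is estimated
    (rank is then taken from W_fixed).
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = W_fixed if W_fixed is not None else rng.random((n_bins, rank)) + EPS
    H = rng.random((W.shape[1], n_frames)) + EPS
    for _ in range(n_iter):
        # Activation update; the penalty mu appears in the denominator.
        H *= (W.T @ V) / (W.T @ W @ H + mu + EPS)
        if W_fixed is None:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
            # Normalize atoms to unit L2 norm so the penalty cannot be
            # evaded by rescaling; H is rescaled to keep W @ H unchanged.
            norms = np.linalg.norm(W, axis=0, keepdims=True) + EPS
            W /= norms
            H *= norms.T
    return W, H


def enhance(V_noisy, W_speech, W_noise, mu=0.1, n_iter=200):
    """Estimate the speech magnitude from a noisy magnitude spectrogram."""
    W = np.hstack([W_speech, W_noise])
    _, H = sparse_nmf(V_noisy, W.shape[1], mu=mu, n_iter=n_iter, W_fixed=W)
    k = W_speech.shape[1]
    speech_part = W_speech @ H[:k]
    total = W @ H + EPS
    # Soft mask in [0, 1] applied to the noisy magnitudes.
    return V_noisy * speech_part / total


# Toy data standing in for CQT magnitudes (48 bins x 100 frames).
rng = np.random.default_rng(1)
V_speech = rng.random((48, 8)) @ rng.random((8, 100))
V_noise = rng.random((48, 4)) @ rng.random((4, 100))

W_s, _ = sparse_nmf(V_speech, rank=8, mu=0.1)  # speech dictionary
W_n, _ = sparse_nmf(V_noise, rank=4, mu=0.1)   # noise dictionary

V_noisy = V_speech + V_noise
V_hat = enhance(V_noisy, W_s, W_n)
print(V_hat.shape)
```

In practice the magnitudes would come from a CQT of the audio (for example the magnitude of the output of `librosa.cqt`), and the enhanced magnitudes would be combined with the noisy phase before inversion; both steps are assumptions outside this sketch.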