GLMSnet: Single Channel Speech Separation Framework in Noisy and Reverberant Environments

Bibliographic Details
Published in: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 663-670
Main Authors: Shi, Huiyu; Chen, Xi; Kong, Tianlong; Yin, Shouyi; Ouyang, Peng
Format: Conference Proceeding
Language: English
Published: IEEE, 13.12.2021
Summary: In real noisy and reverberant environments, the performance of current single-channel speech separation algorithms degrades significantly. To address this, the paper proposes a novel speech separation framework called the Graph convolution and Leading global Multi-scale separation network (GLMSnet). A graph convolution network (GCN) is applied to high-level features to model global context and incorporate long-range information, and it can be inserted at any desired position in the network. In addition, global multi-scale convolution is proposed to aggregate features from different levels and improve the audio quality of the separated speech. A leading factor is applied to increase the valid information of the target speech. The method is evaluated on the WHAMR! dataset. The results show that it achieves state-of-the-art speech separation performance in the presence of noise and reverberation, improving on the previous best model by 22.7%.
DOI: 10.1109/ASRU51503.2021.9688217