GLMSnet: Single Channel Speech Separation Framework in Noisy and Reverberant Environments

Bibliographic Details
Published in	2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 663-670
Main Authors	Shi, Huiyu; Chen, Xi; Kong, Tianlong; Yin, Shouyi; Ouyang, Peng
Format	Conference Proceeding
Language	English
Published	IEEE 13.12.2021
Subjects	Aggregates; cocktail party problem; Conferences; Convolution; Noise measurement; reverberation; Speech enhancement; Speech separation; Training
Summary:	The performance of current single-channel speech separation algorithms degrades significantly in real noisy and reverberant environments. To address this, the paper proposes a novel speech separation framework, the Graph convolution and Leading global Multi-scale separation network (GLMSnet). A graph convolution network (GCN) is applied to high-level features to model global context and incorporate long-range information, and it can be inserted at any desired position in the network. In addition, global multi-scale convolution is proposed to aggregate features from different levels and improve the audio quality of the separated speech. A leading factor is applied to increase the valid information of the target speech. We evaluate our method on the WHAMR! dataset. The results show that it achieves state-of-the-art speech separation performance in the presence of noise and reverberation, improving on the previous best model by 22.7%.
DOI:	10.1109/ASRU51503.2021.9688217
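
Illustrative note: the summary describes a graph convolution over high-level features as a way to bring in global context. The sketch below shows one generic way such a step could look, using only the Python standard library. It is not the paper's implementation: the similarity-based adjacency, the function names (`matmul`, `row_softmax`, `graph_conv`), and all numeric values are assumptions made for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def row_softmax(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def graph_conv(features, weight):
    """One generic graph-convolution step: ReLU(A @ X @ W).

    Assumption: the adjacency A is built from pairwise dot-product
    similarity between frames, so every frame is linked to every other.
    """
    transposed = [list(col) for col in zip(*features)]
    adjacency = row_softmax(matmul(features, transposed))  # T x T, rows sum to 1
    aggregated = matmul(adjacency, features)               # each frame mixes in all frames
    projected = matmul(aggregated, weight)                 # T x D_out
    return [[max(0.0, v) for v in row] for row in projected]

# Toy input: 4 frames with 3 features each (hypothetical values).
frames = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]  # hypothetical 3 x 2 projection
output = graph_conv(frames, weight)
```

Because each row of the adjacency matrix spans all frames, every output frame is a weighted mix of the whole sequence, which is the "long-range information" idea in its simplest form.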