EAD-Conformer: a Conformer-Based Encoder-Attention-Decoder-Network for Multi-Task Audio Source Separation

Bibliographic Details
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 521-525
Main Authors Li, Chenxing, Wang, Yang, Deng, Feng, Zhang, Zhuo, Wang, Xiaorui, Wang, Zhongyuan
Format Conference Proceeding
Language English
Published IEEE 23.05.2022

Summary: This paper proposes a Conformer-based network, EAD-Conformer, to improve the performance of multi-task audio source separation. The network uses Conformer blocks to capture both local and global information, and its encoder-attention-decoder structure encourages attentive modeling of each source. Specifically, a Conformer-based encoder first extracts feature representations from the mixture. An attention module then extracts selective information for each track and bridges the encoder and the decoders. Finally, three decoders process the attentive features and generate output masks for the respective sources. In addition, the proposed discriminate loss further enlarges the distance between different sources. Experiments demonstrate the effectiveness of EAD-Conformer, which achieves signal-to-distortion ratio improvements of 13.37 dB, 11.41 dB, and 10.56 dB on the speech, music, and noise tracks, respectively, and shows advantages over several well-known models.
ISSN: 2379-190X
DOI: 10.1109/ICASSP43922.2022.9747830
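
Note on the reported metric: the summary gives results as signal-to-distortion ratio (SDR) improvement in dB. The following is a minimal sketch of how such a figure can be computed, assuming the plain energy-ratio definition of SDR and taking the unprocessed mixture as the baseline. The record does not state which SDR variant the paper uses, so the function names `sdr_db` and `sdr_improvement_db` and the synthetic signals are illustrative assumptions, not the authors' evaluation code.

```python
import math
import random


def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: reference energy over error energy (assumed definition)."""
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


def sdr_improvement_db(reference, estimate, mixture):
    """SDR of the separated track minus SDR of the raw mixture, both against the reference."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Synthetic stand-ins for one track (hypothetical data, not from the paper).
random.seed(0)
speech = [random.gauss(0.0, 1.0) for _ in range(16000)]
interference = [random.gauss(0.0, 1.0) for _ in range(16000)]

mixture = [s + n for s, n in zip(speech, interference)]
# Pretend a separator removed 90% of the interference amplitude.
estimate = [s + 0.1 * n for s, n in zip(speech, interference)]

sdri = sdr_improvement_db(speech, estimate, mixture)
print(f"SDR improvement: {sdri:.2f} dB")
```

In this toy case the residual interference amplitude drops by a factor of 10, so its energy drops by a factor of 100 and the improvement is about 20 dB. The values in the summary would be obtained per track (speech, music, noise) with the paper's own evaluation protocol.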