Separation Structure Speech Separation Model Fusing Conformer and NBC
Published in | Journal of physics. Conference series Vol. 2384; no. 1; pp. 12034 - 12040 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Bristol: IOP Publishing, 01.12.2022 |
Subjects | |
Summary: | Abstract: The quality of speech separation affects the entire speech technology ecosystem. Blind source separation based on the Transformer dual-path recurrent neural network suffers from low utilization of local feature information, slow convergence, a large number of parameters, and long computation time. To address these problems, a blind source separation model that fuses the Conformer with the NBC (Narrow-band Conformer) is proposed. First, to improve the utilization of local speech features and the convergence speed, the Transformer block in the intra-block path of the dual-path recurrent network is replaced by a Conformer block. Second, the Transformer block in the inter-block loop is replaced by an NBC block. The NBC block simplifies the computation of vector similarity and vector aggregation, which greatly reduces the number of parameters and the computational cost and lowers the computational complexity of the model. In experiments on the WSJ0-2mix [7] and WHAM [13] datasets, the proposed model converges faster and separates sources better than the other models compared. |
---|---|
ISSN: | 1742-6588 1742-6596 |
DOI: | 10.1088/1742-6596/2384/1/012034 |
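
Note on the dual-path structure: the summary describes a dual-path network with one block type for local (intra-block) modeling and another for the inter-block loop. The following is a minimal sketch of that general data flow only, not the authors' implementation. The names `segment`, `dual_path_block`, and `add_mean`, as well as the chunk and hop sizes, are hypothetical; `add_mean` is a trivial placeholder standing in for both the Conformer block and the NBC block, whose internals the record does not give.

```python
from typing import Callable, List

Frame = List[float]          # one feature vector
Seq = List[Frame]            # a sequence of frames
Mixer = Callable[[Seq], Seq] # any sequence-to-sequence block of equal length


def segment(frames: Seq, chunk: int, hop: int) -> List[Seq]:
    """Split frames into overlapping chunks, zero-padding the tail."""
    dim = len(frames[0])
    # Ceiling division written with negated floor division.
    n_chunks = max(1, -(-(len(frames) - chunk) // hop) + 1)
    needed = (n_chunks - 1) * hop + chunk
    padded = frames + [[0.0] * dim for _ in range(needed - len(frames))]
    return [padded[i * hop : i * hop + chunk] for i in range(n_chunks)]


def dual_path_block(chunks: List[Seq], intra: Mixer, inter: Mixer) -> List[Seq]:
    """Apply a local (intra) mixer per chunk, then a global (inter) mixer across chunks."""
    # Intra path: each chunk is processed along its own short time axis.
    chunks = [intra(c) for c in chunks]
    # Inter path: for each position inside a chunk, process the sequence
    # formed by taking that position from every chunk.
    size = len(chunks[0])
    columns = [inter([c[p] for c in chunks]) for p in range(size)]
    # Transpose back to chunk-major layout.
    return [[columns[p][s] for p in range(size)] for s in range(len(chunks))]


def add_mean(seq: Seq) -> Seq:
    """Placeholder mixer (assumption): adds the sequence mean to every frame."""
    dim = len(seq[0])
    mean = [sum(f[d] for f in seq) / len(seq) for d in range(dim)]
    return [[f[d] + mean[d] for d in range(dim)] for f in seq]


if __name__ == "__main__":
    frames = [[float(i)] for i in range(10)]
    chunks = segment(frames, chunk=4, hop=2)
    out = dual_path_block(chunks, intra=add_mean, inter=add_mean)
    print(len(out), len(out[0]))
```

In a real model the two mixers would be learned blocks and the chunked output would be recombined by overlap-add; both steps are omitted here.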