Analysis of Deep Learning Algorithm for Speech Separation: Implementation and Performance Evaluation

This research presents a novel deep learning-based method for accurately classifying speech sound sources in a given space. The proposed technique utilizes a carefully positioned microphone array to capture combined signals from various speakers, voices, and background noise. By treating speech sepa...

Full description

Saved in:

Bibliographic Details
Published in	2023 International Conference on Network, Multimedia and Information Technology (NMITCON) pp. 1 - 6
Main Authors	Udawant, Krushna S., Mali, Swapnil G., Mahajan, Shrinivas P.
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2023
Subjects	Analysis Architecture CNN Deep Learning DNN Feature extraction Multimedia communication Noise measurement Performance evaluation Recording Speech Separation Supervised learning Training
Online Access	Get full text

Cover

Loading…

More Information
Summary:	This research presents a novel deep learning-based method for accurately classifying speech sound sources in a given space. The proposed technique utilizes a carefully positioned microphone array to capture combined signals from various speakers, voices, and background noise. By treating speech separation as a supervised learning problem, the system is trained to identify and separate the original signals from the mixed audio. To achieve this, a cutting-edge sound separation architecture is developed, and an extensive database is curated to facilitate deep neural network model training, testing, and comparison. Themain objective is to demonstrate the efficacy of the deep neural network-based technique in effectively separating the original speech signals. The research explores different learning strategies and training goals, considering the acoustic characteristics of the audio signals. Various separation algorithms, including multimi-crophone techniques, monophonic approaches for voice augmentation, and speech de-reverberation, are thoroughly investigated. The experimental outcomes reveal that the proposed framework outperforms Convolutional neural network models in terms of separation performance. The study provides valuable insightsinto model construction and evaluation, significantly advancing deep learning-based voice separation algorithms.
DOI:	10.1109/NMITCON58196.2023.10276200