Deep Learning Approaches for Enhanced Audio Quality Through Noise Reduction

Bibliographic Details
Published in: 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE), pp. 447-453
Main Authors: Lohani, Bhanu; Gautam, Chhavi Krishan; Kushwaha, Pradeep Kumar; Gupta, Amardeep
Format: Conference Proceeding
Language: English
Published: IEEE, 09.05.2024
Summary: In applications such as telecommunications, multimedia, speech recognition, and voice interfaces, background noise can significantly degrade audio quality and intelligibility. Traditional noise reduction methods such as spectral subtraction and Wiener filtering have limitations in handling complex, dynamic noise environments [1]. This research explores the potential of AI and machine learning techniques, especially deep learning architectures such as convolutional neural networks (CNNs), autoencoders, and generative adversarial networks (GANs) [3], for effective background noise reduction in audio files. The paper provides an overview of existing research, highlighting the strengths and limitations of traditional methods versus the potential advantages of AI/ML approaches. The methodology outlines data acquisition, preprocessing, feature extraction, and the proposed deep learning algorithms and architectures. Performance is evaluated with objective metrics (signal-to-noise ratio, perceptual evaluation of speech quality) [24] and subjective listening tests. Results include quantitative and qualitative analyses, comparative studies with state-of-the-art methods [1], and exploration of real-world applications across telecommunications, multimedia, speech recognition, and voice interfaces. The paper also discusses key findings, limitations, and future research directions such as handling complex noise environments, improving computational efficiency, multi-modal/multi-task learning, and integrating domain knowledge.
DOI: 10.1109/IC3SE62002.2024.10593073