Deep Learning Approaches for Enhanced Audio Quality Through Noise Reduction
Published in | 2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE) pp. 447 - 453 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 09.05.2024 |
Summary: | In applications such as telecommunications, multimedia, speech recognition, and voice interfaces, background noise can significantly degrade audio quality and intelligibility. Traditional noise reduction methods such as spectral subtraction and Wiener filtering have limitations in handling complex, dynamic noise environments[1]. This research explores the potential of AI and machine learning techniques, especially deep learning architectures like convolutional neural networks (CNNs), autoencoders, and generative adversarial networks (GANs)[3], for effective background noise reduction in audio files. The paper provides an overview of existing research, highlighting the strengths and limitations of traditional methods versus the potential advantages of AI/ML approaches. The methodology outlines data acquisition, preprocessing, feature extraction, and the proposed deep learning algorithms and architectures. Performance is evaluated with objective metrics (signal-to-noise ratio, perceptual evaluation of speech quality)[24] and subjective listening tests. Results include quantitative and qualitative analyses, comparative studies with state-of-the-art methods[1], and an exploration of real-world applications across telecommunications, multimedia, speech recognition, and voice interfaces. Key findings, limitations, and future research directions are discussed, including handling complex noise environments, improving computational efficiency, multi-modal/multi-task learning, and integrating domain knowledge. |
---|---|
DOI: | 10.1109/IC3SE62002.2024.10593073 |
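
The summary names signal-to-noise ratio (SNR) as one of the objective metrics. As a rough illustration only, and not the paper's own evaluation code, the sketch below computes SNR in decibels as 10·log10(signal power / noise power), treating the difference between a processed signal and a clean reference as noise. The function name `snr_db`, the synthetic 440 Hz tone, the 16 kHz sample rate, and the noise level are all assumptions made for this example.

```python
import numpy as np


def snr_db(clean: np.ndarray, estimate: np.ndarray) -> float:
    """Return the SNR (in dB) of `estimate` relative to the reference `clean`.

    Hypothetical helper for illustration: everything in `estimate` that
    differs from `clean` is counted as noise. If the two signals are
    identical, the noise power is zero and the result is infinite.
    """
    noise = estimate - clean
    signal_power = np.sum(clean ** 2)
    noise_power = np.sum(noise ** 2)
    return 10.0 * np.log10(signal_power / noise_power)


# Assumed synthetic data: a 1-second, 440 Hz tone sampled at 16 kHz,
# corrupted by additive white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = clean + 0.1 * rng.standard_normal(t.shape)

print(f"SNR of noisy signal: {snr_db(clean, noisy):.1f} dB")
```

A denoiser's output can be passed as `estimate` in place of `noisy`; a higher value than the noisy input's SNR indicates that the residual error has shrunk. The perceptual evaluation of speech quality metric also mentioned in the summary is a standardized perceptual model and is not reproduced here.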