A Novel Throat Microphone Speech Enhancement Framework Based on Deep BLSTM Recurrent Neural Networks

Bibliographic Details
Published in: 2018 IEEE 4th International Conference on Computer and Communications (ICCC), pp. 1258-1262
Main Authors: Zheng, Changyan; Zhang, Xiongwei; Sun, Meng; Yang, Jibin; Xing, Yibo
Format: Conference Proceeding
Language: English, Japanese
Published: IEEE 01.12.2018
DOI: 10.1109/CompComm.2018.8780872

Summary: Body-conducted microphone (BCM) speech is immune to noise but suffers from shortcomings such as severe loss of high-frequency components. Directly enhancing BCM speech is worthwhile, but little work has been done so far. First, given the lack of open datasets, we constructed a dedicated throat microphone (TM) speech dataset and made it available online. Second, we propose a novel speaker-dependent TM speech enhancement framework based on deep bidirectional recurrent neural networks with Long Short-Term Memory units (BLSTM-RNN). In this framework, magnitude spectra are transformed directly to achieve speech enhancement, unlike previous work based on the source-filter model. The BLSTM-RNN is deployed to further improve the transformation results. Objective and subjective results show that the quality of TM speech is substantially improved: PESQ, STOI, and MOS scores increase by 0.71, 0.21, and 1.36, respectively, and another important criterion, LSD, decreases by 0.63.
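
For readers unfamiliar with the LSD criterion mentioned in the summary, the following is a minimal sketch of one common definition of log-spectral distance between a reference and an enhanced magnitude spectrogram. This record does not state the exact formula, framing, or scaling used by the authors, so the function name `log_spectral_distance`, the `eps` floor, and the per-frame root-mean-square form are assumptions for illustration only.

```python
import math


def log_spectral_distance(ref_mag, est_mag, eps=1e-10):
    """Mean log-spectral distance in dB between two magnitude spectrograms.

    ref_mag, est_mag: lists of frames; each frame is a list of non-negative
    magnitudes with the same number of frequency bins.
    NOTE: illustrative definition only; the paper's exact variant is not
    given in this record.
    """
    if not ref_mag or len(ref_mag) != len(est_mag):
        raise ValueError("spectrograms must be non-empty with equal frame counts")
    per_frame = []
    for ref_frame, est_frame in zip(ref_mag, est_mag):
        if not ref_frame or len(ref_frame) != len(est_frame):
            raise ValueError("frames must be non-empty with equal bin counts")
        sq_diffs = []
        for r, e in zip(ref_frame, est_frame):
            # Log power spectrum in dB: 10*log10(|X|^2) == 20*log10(|X|).
            # eps (an assumed floor) avoids log10(0) on silent bins.
            ref_db = 20.0 * math.log10(max(r, eps))
            est_db = 20.0 * math.log10(max(e, eps))
            sq_diffs.append((ref_db - est_db) ** 2)
        # Root-mean-square difference over frequency bins for this frame.
        per_frame.append(math.sqrt(sum(sq_diffs) / len(sq_diffs)))
    # Average the per-frame distances over time.
    return sum(per_frame) / len(per_frame)
```

Under this definition a lower value means the enhanced spectrum is closer to the reference, which is why the summary reports a decrease in LSD as an improvement.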