A Novel Throat Microphone Speech Enhancement Framework Based on Deep BLSTM Recurrent Neural Networks
Published in | 2018 IEEE 4th International Conference on Computer and Communications (ICCC) pp. 1258 - 1262 |
---|---|
Format | Conference Proceeding |
Language | English, Japanese |
Published | IEEE, 01.12.2018 |
DOI | 10.1109/CompComm.2018.8780872 |
Summary | Body-conducted microphone (BCM) speech is immune to noise but has shortcomings such as severe loss of high-frequency components. Direct enhancement of BCM speech is meaningful, but little work has been done so far. First, considering the lack of open datasets, we construct a dedicated dataset of throat microphone (TM) speech and make it available online. Second, we propose a novel speaker-dependent TM speech enhancement framework based on deep bidirectional recurrent neural networks using Long Short-Term Memory units (BLSTM-RNN). In this framework, magnitude spectra are transformed directly to achieve speech enhancement, which differs from previous work based on the source-filter model. BLSTM-RNN is deployed to further improve the results of the transformation. Objective and subjective results show that the quality of TM speech is substantially improved: PESQ, STOI and MOS scores improve by 0.71, 0.21 and 1.36, respectively, and another important criterion, LSD, decreases by 0.63. |
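
The record does not define the LSD criterion mentioned in the summary. As a minimal sketch, assuming LSD denotes log-spectral distance in its common form (the per-frame root-mean-square difference between log power spectra in dB, averaged over frames), it could be computed as below. The function name, the `eps` floor and the list-of-frames input layout are illustrative assumptions and are not taken from the paper.

```python
import math


def log_spectral_distance(ref_mag, est_mag, eps=1e-10):
    """Average per-frame RMS difference (in dB) between two magnitude spectrograms.

    ref_mag, est_mag: sequences of frames, each frame a sequence of
    non-negative magnitude values with the same number of frequency bins.
    eps: hypothetical floor that avoids taking log10(0).
    """
    if len(ref_mag) != len(est_mag) or not ref_mag:
        raise ValueError("spectrograms must be non-empty with equal frame counts")

    per_frame = []
    for ref_frame, est_frame in zip(ref_mag, est_mag):
        if len(ref_frame) != len(est_frame) or not ref_frame:
            raise ValueError("frames must be non-empty with equal bin counts")
        # 20*log10(|X|) is the log power spectrum in dB (10*log10(|X|^2)).
        squared_diffs = [
            (20.0 * math.log10(max(r, eps)) - 20.0 * math.log10(max(e, eps))) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        per_frame.append(math.sqrt(sum(squared_diffs) / len(squared_diffs)))

    # Average the per-frame distances over all frames.
    return sum(per_frame) / len(per_frame)
```

Under this definition, identical spectrograms give a distance of zero, and a lower value means the enhanced spectrum is closer to the reference, which is consistent with the summary reporting a decrease in LSD as an improvement.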