Optimization of Head-related Transfer Function (HRTF) Models
Published in | IEEE International Conference on Consumer Electronics-Berlin, pp. 251–256 |
---|---|
Main Author | |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 08.09.2019 |
Summary: | This paper presents a deep-learning technique for modeling the Head-Related Transfer Function (HRTF) using a large dataset from IRCAM [7] of 49 subjects with 115 directions per subject. A log-spectral distortion metric is used to optimize the performance of the deep-learning model. The presented model first uses a sparse autoencoder (AE) to capture the salient features of the user-captured HRTFs (peaks, and notches for elevation) in its latent representation, and then uses a generalized regression neural network (GRNN) to approximate the latent representation of the corresponding HRTFs at arbitrary angles. The sparse AE then decodes the GRNN output to reconstruct the HRTF. Measured objectively by log-spectral distortion, the proposed approach is shown to generalize better over the HRTFs in the IRCAM dataset than the state-of-the-art linear-least-squares optimum solution (derived from Principal Component Analysis). Toward this, Bayesian optimization was performed to tune the hyper-parameters of the sparse AE and the GRNN. |
---|---|
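The log-spectral distortion (LSD) metric named in the summary can be sketched as follows. This is the common definition — the RMS difference, in dB, between the log-magnitude spectra of the measured and reconstructed HRTFs — and is an assumption about the paper's exact formulation (the averaging and frequency range may differ):

```python
import numpy as np

def log_spectral_distortion(h_true, h_pred, eps=1e-12):
    """RMS difference, in dB, between the log-magnitude spectra of two HRTFs.

    h_true, h_pred: complex frequency responses of equal length.
    eps guards against log10(0) at spectral nulls.
    """
    mag_true = 20.0 * np.log10(np.abs(h_true) + eps)
    mag_pred = 20.0 * np.log10(np.abs(h_pred) + eps)
    return np.sqrt(np.mean((mag_true - mag_pred) ** 2))

# Identical spectra give zero distortion; a uniform 2x gain gives ~6.02 dB.
h = np.fft.rfft(np.random.randn(256)) + 1.0
print(log_spectral_distortion(h, h))
print(log_spectral_distortion(h, 2.0 * h))
```

A lower LSD means the reconstructed HRTF is spectrally closer to the measured one, which is why it serves both as the evaluation metric and as the objective driving the model tuning described above.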
ISSN: | 2166-6822 |
DOI: | 10.1109/ICCE-Berlin47944.2019.8966196 |
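For reference, the GRNN mentioned in the summary is Specht's generalized regression neural network, which is equivalent to Nadaraya–Watson kernel regression. A minimal sketch of how such a model could interpolate latent AE codes at an arbitrary direction — the direction encoding, latent dimensionality, and bandwidth here are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

def grnn_predict(X_train, Y_train, x_query, sigma=0.3):
    """GRNN (Nadaraya-Watson kernel regression) prediction at one query point.

    X_train: (N, d) training inputs, e.g. (azimuth, elevation) pairs.
    Y_train: (N, k) training targets, e.g. latent AE codes of measured HRTFs.
    x_query: (d,) direction at which to estimate the latent code.
    sigma:   kernel bandwidth, a hyper-parameter of the kind the paper
             tunes with Bayesian optimization.
    """
    d2 = np.sum((X_train - x_query) ** 2, axis=1)   # squared distances to query
    w = np.exp(-d2 / (2.0 * sigma ** 2))            # Gaussian kernel weights
    return w @ Y_train / np.sum(w)                  # weighted average of targets

# Toy usage: interpolate 2-D latent codes between two measured directions.
X = np.array([[0.0, 0.0], [1.0, 0.0]])   # two directions
Y = np.array([[1.0, 0.0], [0.0, 1.0]])   # their latent codes
print(grnn_predict(X, Y, np.array([0.5, 0.0])))  # midpoint -> [0.5, 0.5]
```

In the pipeline described above, the GRNN output at the queried direction would then be passed through the sparse AE's decoder to reconstruct the HRTF magnitude response.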