Optimization of Head-related Transfer Function (HRTF) Models

Bibliographic Details
Published in: IEEE International Conference on Consumer Electronics-Berlin, pp. 251 - 256
Main Author: Bharitkar, Sunil
Format: Conference Proceeding
Language: English
Published: IEEE, 08.09.2019

Summary: This paper presents a deep-learning technique for modeling Head-Related Transfer Functions (HRTFs) using a large IRCAM dataset [7] of 49 subjects with 115 directions per subject. A log-spectral distortion metric is used to optimize the performance of the deep-learning model. The model first uses a sparse autoencoder (AE) to capture, in its latent representation, the salient features of the user-captured HRTFs (peaks, and notches for elevation), and then uses a generalized regression neural network (GRNN) to approximate the latent representation of the corresponding HRTFs at arbitrary angles. The sparse AE then decodes the GRNN output to reconstruct the HRTF. Measured objectively by log-spectral distortion, the proposed approach generalizes better over the HRTFs in the IRCAM dataset than the state-of-the-art linear-least-squares optimal solution (viz., derived from Principal Component Analysis). Bayesian optimization is used to tune the hyper-parameters of the sparse AE and the GRNN.
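The log-spectral distortion objective mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming the common definition of LSD as the RMS of the dB difference between reference and reconstructed magnitude spectra; the function name, the `eps` guard, and the exact averaging (over frequency bins only, rather than over directions or subjects) are assumptions, and the paper may aggregate the metric differently.

```python
import numpy as np

def log_spectral_distortion(h_ref, h_est, eps=1e-12):
    """Log-spectral distortion (in dB) between a reference HRTF
    magnitude spectrum and its reconstruction.

    h_ref, h_est: arrays of magnitude-spectrum samples (linear scale).
    eps guards against log of zero; its value is an assumption.
    """
    h_ref = np.asarray(h_ref, dtype=float)
    h_est = np.asarray(h_est, dtype=float)
    # 20*log10 of the magnitude ratio gives the per-bin error in dB
    diff_db = 20.0 * np.log10((h_ref + eps) / (h_est + eps))
    # RMS of the dB error across frequency bins
    return np.sqrt(np.mean(diff_db ** 2))
```

A perfect reconstruction yields 0 dB; e.g., a uniform factor-of-10 magnitude error yields 20 dB, so lower values indicate a closer spectral match.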
ISSN: 2166-6822
DOI: 10.1109/ICCE-Berlin47944.2019.8966196