C-SL: Contrastive Sound Localization with Inertial-Acoustic Sensors
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 09.06.2020 |
Summary: | The human brain uses perceptual information about head and eye movements
to update the spatial relationship between the individual and the surrounding
environment. Based on this cognitive process known as spatial updating, we
introduce contrastive sound localization (C-SL) with mobile inertial-acoustic
sensor arrays of arbitrary geometry. C-SL uses unlabeled multi-channel audio
recordings and inertial measurement unit (IMU) readings collected during free
rotational movements of the array to learn mappings from acoustical
measurements to an array-centered direction-of-arrival (DOA) in a
self-supervised manner. Unlike conventional DOA estimation methods, which
require knowledge of either the array geometry or the source locations in the
calibration stage, C-SL is agnostic to both, and can be trained on data
collected in minimally constrained settings. To achieve this capability, our
proposed method utilizes a customized contrastive loss measuring the spatial
contrast between source locations predicted for disjoint segments of the input
to jointly update estimated DOAs and the acoustic-spatial mapping in linear
time. We provide quantitative and qualitative evaluations of C-SL, comparing its
performance with baseline DOA estimation methods across a wide range of conditions.
We believe the relaxed calibration process offered by C-SL paves the way toward
truly personalized augmented hearing applications. |
---|---|
DOI: | 10.48550/arxiv.2006.05071 |
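
The summary describes learning from the agreement between DOAs predicted for disjoint segments and the array rotation reported by the IMU. The sketch below is not the C-SL loss itself, whose exact form is not given in this record. It is a simplified rotation-consistency surrogate under stated assumptions: a single stationary source, azimuth-only DOAs in radians, and a known yaw change between the two segments. The function name `rotation_consistency_loss` and all parameter names are hypothetical.

```python
import math


def rotation_consistency_loss(doa_a, doa_b, yaw_delta):
    """Mean rotation-consistency penalty over pairs of segments.

    Assumption (not from the source): if the array yaws by `yaw_delta`
    between segment A and segment B while the source stays fixed, the
    array-centered azimuth should change by -yaw_delta.
    All angles are in radians.
    """
    # Residual is zero when the predicted DOA shift cancels the IMU rotation.
    residuals = [(b - a) + d for a, b, d in zip(doa_a, doa_b, yaw_delta)]
    # 1 - cos(r) is periodic in 2*pi, so angle wrap-around needs no special case.
    return sum(1.0 - math.cos(r) for r in residuals) / len(residuals)


# Hypothetical example: source at 60 degrees, array then yaws by +20 degrees,
# so the array-centered azimuth becomes 40 degrees.
consistent = rotation_consistency_loss(
    [math.radians(60)], [math.radians(40)], [math.radians(20)]
)
# Predictions that ignore the rotation are penalized.
inconsistent = rotation_consistency_loss(
    [math.radians(60)], [math.radians(60)], [math.radians(20)]
)
```

Here `consistent` is close to zero, while `inconsistent` is strictly larger. A penalty of this kind can be computed without knowing the array geometry or the true source location, which is the property the summary emphasizes.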