Why Do Angular Margin Losses Work Well for Semi-Supervised Anomalous Sound Detection?

Bibliographic Details
Published in	IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 32, pp. 1-15
Main Authors	Wilkinghoff, Kevin; Kurth, Frank
Format	Journal Article
Language	English
Published	Piscataway: IEEE, 01.01.2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	angular margin loss; anomaly detection; Classification; compactness loss; Condition monitoring; Data models; domain generalization; explainable artificial intelligence; machine listening; Monitoring; Noise measurement; Performance evaluation; representation learning; Representations; Speech processing; Task analysis; Training
Summary:	State-of-the-art anomalous sound detection systems often use angular margin losses to learn suitable representations of acoustic data through an auxiliary task, usually a supervised or self-supervised classification task. The underlying idea is that, in order to solve this auxiliary task, specific information about normal data needs to be captured in the learned representations, and that this information is also sufficient to differentiate between normal and anomalous samples. Especially in noisy conditions, discriminative models based on angular margin losses tend to significantly outperform systems based on generative or one-class models. The goal of this work is to investigate why using angular margin losses with auxiliary tasks works well for detecting anomalous sounds. To this end, it is shown, both theoretically and experimentally, that minimizing angular margin losses also minimizes the compactness loss while inherently preventing the model from learning trivial solutions. Furthermore, multiple experiments are conducted to show that using a related classification task as an auxiliary task teaches the model to learn representations suitable for detecting anomalous sounds in noisy conditions. These experiments include performance evaluations, visualizations of the embedding space with t-SNE, and visualizations of the input representations with respect to the anomaly score using randomized input sampling for explanation.
ISSN:	2329-9290 2329-9304
DOI:	10.1109/TASLP.2023.3337153
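
Illustrative sketch (not part of the catalog record): the summary states that minimizing an angular margin loss also minimizes the compactness loss. The record does not give the paper's exact formulations, so the NumPy code below assumes two common textbook forms: a compactness loss defined as the mean squared distance of embeddings to their centers, and an ArcFace-style angular margin loss. The function names, the scale and margin values, and the toy data in the usage note are all hypothetical.

```python
import numpy as np


def compactness_loss(embeddings, centers):
    """Mean squared Euclidean distance between embeddings and their centers.

    `centers` may be a single center of shape (d,) or one center per
    sample of shape (n, d). This is an assumed, generic definition.
    """
    diffs = embeddings - centers
    return float(np.mean(np.sum(diffs ** 2, axis=1)))


def arcface_loss(embeddings, class_centers, labels, scale=16.0, margin=0.5):
    """ArcFace-style angular margin loss (assumed variant, for illustration).

    Note: this sketch does not handle the case where angle + margin
    exceeds pi, which production implementations usually treat specially.
    """
    # Project embeddings and class centers onto the unit hypersphere.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_centers / np.linalg.norm(class_centers, axis=1, keepdims=True)
    # Angles between every embedding and every class center.
    theta = np.arccos(np.clip(x @ w.T, -1.0, 1.0))
    # Add the margin to the angle of the target class only.
    rows = np.arange(len(labels))
    theta[rows, labels] += margin
    logits = scale * np.cos(theta)
    # Numerically stable softmax cross-entropy on the margin-adjusted logits.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[rows, labels]))
```

As a usage example under the same assumptions, take two orthogonal class centers in two dimensions and compare embeddings that coincide with their class centers against embeddings rotated 30 degrees toward the other class. Both compactness_loss(embeddings, class_centers[labels]) and arcface_loss(embeddings, class_centers, labels) come out smaller for the first set, which is the direction of the relationship the summary describes. A toy comparison like this only illustrates the claim; it is not the proof given in the article.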