Human-Scene Network: A novel baseline with self-rectifying loss for weakly supervised video anomaly detection

Bibliographic Details
Published in Computer Vision and Image Understanding Vol. 241; p. 103955
Main Authors Majhi, Snehashis; Dai, Rui; Kong, Quan; Garattoni, Lorenzo; Francesca, Gianpiero; Brémond, François
Format Journal Article
Language English
Published Elsevier Inc 01.04.2024
Summary: Video anomaly detection in surveillance systems with only video-level labels (i.e. weakly supervised) is challenging. This is due to (i) the complex integration of a large variety of scenarios including human and scene-based anomalies characterized by subtle or sharp spatio-temporal cues in real-world videos and (ii) non-optimal optimization between normal and anomaly instances under weak supervision. In this paper, we propose a Human-Scene Network to learn discriminative representations by capturing both subtle and strong cues in a dissociative manner. In addition, a self-rectifying loss is proposed that dynamically computes the pseudo-temporal annotations from video-level labels for optimizing the Human-Scene Network effectively. The proposed Human-Scene Network optimized with self-rectifying loss is validated on three publicly available datasets i.e. UCF-Crime, ShanghaiTech, and IITB-Corridor, outperforming recently reported state-of-the-art approaches on five out of the six scenarios considered.
• A Human-Scene Network to detect human and scene centric divergent video anomalies.
• An effective and salient feature combination strategy in decoupled sub-networks.
• A self-rectifying loss for better separability among instances in weak-supervision.
• The results outperform benchmark methods on many scenarios considered.
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2024.103955