Integrating prediction and reconstruction for anomaly detection


Bibliographic Details
Published in: Pattern Recognition Letters, Vol. 129, pp. 123–130
Main Authors: Tang, Yao; Zhao, Lin; Zhang, Shanshan; Gong, Chen; Li, Guangyu; Yang, Jian
Format: Journal Article
Language: English
Published: Amsterdam: Elsevier B.V., 01.01.2020

Summary:
•A novel framework is proposed for anomaly detection in videos.
•The innovation lies in the combination of prediction and reconstruction methods.
•This work is more robust to noise and is suitable for real-world surveillance videos.
•This work outperforms both prediction (baseline) and reconstruction approaches.

Anomaly detection in videos refers to identifying events that rarely happen, or should not happen, in a certain context. Among existing methods, reconstruction and future frame prediction are the predominant ideas for detecting anomalies. Reconstruction methods minimize the reconstruction errors of training data, but cannot guarantee large reconstruction errors for abnormal events. Future frame prediction methods follow the premise that normal events are predictable while abnormal ones are not. However, their performance may drop rapidly because prediction is not robust to the noise in real-world surveillance videos. In this paper, we propose an approach that combines the advantages and balances the disadvantages of these two methods. An end-to-end network is designed to conduct future frame prediction and reconstruction sequentially. Future frame prediction makes the reconstruction errors large enough to facilitate the identification of abnormal events, while reconstruction helps enhance the predicted future frames of normal events. Specifically, we connect two U-Net blocks in the generator: one block performs frame prediction, and the other reconstructs the frames generated by the former block. Experiments on several benchmark datasets demonstrate the superiority of our method over previous state-of-the-art approaches, while running in real time at 30 frames per second.
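The summary describes a two-stage generator: a first U-Net block predicts the next frame from past frames, a second block reconstructs the predicted frame, and the error against the true frame serves as the anomaly score. The sketch below illustrates only that pipeline logic with toy NumPy stand-ins (a linear-extrapolation "predictor" and a clipping "reconstructor"); the function names and internals are hypothetical, not the paper's actual U-Net implementation.

```python
import numpy as np

def predict_block(past_frames):
    # Toy predictor: linearly extrapolate the next frame from the last two.
    # In the paper this role is played by the first U-Net block.
    return past_frames[-1] + (past_frames[-1] - past_frames[-2])

def reconstruct_block(frame):
    # Toy reconstructor: project the predicted frame back into the valid
    # intensity range. The paper's second U-Net block refines the prediction.
    return np.clip(frame, 0.0, 1.0)

def anomaly_score(past_frames, actual_next):
    # Score = error between the reconstructed prediction and the true next
    # frame: predictable (normal) motion yields a low score, while an
    # unpredictable (abnormal) event yields a high one.
    pred = predict_block(past_frames)
    recon = reconstruct_block(pred)
    return float(np.mean((recon - actual_next) ** 2))
```

For example, frames continuing a smooth linear motion score near zero, while a frame that breaks the motion pattern scores much higher, which is the signal thresholded to flag anomalies.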
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2019.11.024