Semi-Supervised Log-Based Anomaly Detection via Probabilistic Label Estimation

With the growth of software systems, logs have become an important data to aid system maintenance. Log-based anomaly detection is one of the most important methods for such purpose, which aims to automatically detect system anomalies via log analysis. However, existing log-based anomaly detection ap...

Full description

Saved in:
Bibliographic Details
Published inProceedings / International Conference on Software Engineering pp. 1448 - 1460
Main Authors Yang, Lin, Chen, Junjie, Wang, Zan, Wang, Weijing, Jiang, Jiajun, Dong, Xuyuan, Zhang, Wenbin
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.05.2021
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:With the growth of software systems, logs have become an important data to aid system maintenance. Log-based anomaly detection is one of the most important methods for such purpose, which aims to automatically detect system anomalies via log analysis. However, existing log-based anomaly detection approaches still suffer from practical issues due to either depending on a large amount of manually labeled training data (supervised approaches) or unsatisfactory performance without learning the knowledge on historical anomalies (unsupervised and semi-supervised approaches). In this paper, we propose a novel practical log-based anomaly detection approach, PLELog, which is semi-supervised to get rid of time-consuming manual labeling and incorporates the knowledge on historical anomalies via probabilistic label estimation to bring supervised approaches' superiority into play. In addition, PLELog is able to stay immune to unstable log data via semantic embedding and detect anomalies efficiently and effectively by designing an attention-based GRU neural network. We evaluated PLELog on two most widely-used public datasets, and the results demonstrate the effectiveness of PLELog, significantly outperforming the compared approaches with an average of 181.6% improvement in terms of F1-score. In particular, PLELog has been applied to two real-world systems from our university and a large corporation, further demonstrating its practicability
ISBN:1665402962
9781665402965
ISSN:1558-1225
DOI:10.1109/ICSE43902.2021.00130