Improvement of multi-parameter anomaly detection method: Addition of a relational token between parameters


Bibliographic Details
Published in Cognitive Robotics Vol. 5; pp. 176-191
Main Authors Uchida, Hironori, Tominaga, Keitaro, Itai, Hideki, Li, Yujie, Nakatoh, Yoshihisa
Format Journal Article
Language English
Published Elsevier B.V 2025
KeAi Communications Co. Ltd

Summary: In the continuous development of systems, the increasing volume and complexity of data that engineers must analyze have become significant challenges. To address this issue, extensive research has been conducted on automated anomaly detection in logs. However, due to the limited variety of available datasets, most studies have focused on sequence-based anomalies in logs, with relatively little attention paid to parameter-based anomaly detection. To bridge this gap, we prepared a labeled dataset specifically designed for parameter-based anomaly detection and propose a novel method utilizing BERTMaskedLM. Since continuously changing logs in system development are difficult to label, we also propose a method that enables learning without labeled data. Previous studies have employed BERTMaskedLM to capture relationships between parameters in multi-parameter logs for anomaly detection. However, a known issue arises when the ranges of numerical parameters overlap, resulting in reduced detection accuracy. To mitigate this, we introduced tokens that encode the relationships between parameters, improving the independence of parameter combinations and enhancing anomaly detection accuracy (increasing the F1-score by more than 0.002). In this study, we employed a simple yet effective approach by using the total value of each token as the added token. Since only the parameter portions vary within the same log template structure, these proposed tokens effectively capture the relationships between parameters. Additionally, we visualized the influence of the added tokens and conducted experiments using a new dataset to assess the reliability of our proposed method.
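As a rough illustration of the idea in the summary, the sketch below appends one extra token holding the sum of the numeric parameters in a log line before the line would be handed to a masked language model. It is not the authors' implementation: the `key=value` log format, the function name `add_relation_token`, and the `SUM_` prefix are all assumptions made for this example, and the tokenization and BERTMaskedLM training steps are omitted.

```python
import re

# Matches standalone integers or decimals, skipping digits that are part of
# an identifier (e.g. the "2" in a hypothetical key such as "fan2").
NUMBER = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?")


def add_relation_token(log_line: str, prefix: str = "SUM_") -> str:
    """Append a token carrying the sum of the numeric parameters in log_line.

    Hypothetical helper: the abstract only says the total value is used as
    the added token; the token format here is an assumption.
    """
    values = [float(v) for v in NUMBER.findall(log_line)]
    total = sum(values)
    # Render whole numbers without a trailing ".0".
    text = str(int(total)) if total.is_integer() else f"{total:g}"
    return f"{log_line} {prefix}{text}"


# Made-up log lines sharing one template; only the parameters vary.
lines = ["temp=20 fan=80", "temp=80 fan=20", "temp=80 fan=80"]
augmented = [add_relation_token(line) for line in lines]

for line in augmented:
    print(line)
```

In this made-up data every individual value (20 or 80) appears for both parameters, so the value ranges overlap completely, yet the third line receives a sum token that the first two do not share. That is one way an added relational token could make an unusual combination of parameters more visible to a model; whether it helps in practice depends on the data and on details the summary does not give.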
ISSN: 2667-2413
DOI: 10.1016/j.cogr.2025.03.004