Anomaly detection in virtual machine logs against irrelevant attribute interference


Bibliographic Details
Published in	PloS one Vol. 20; no. 1; p. e0315897
Main Authors	Zhang, Hao; Zhou, Yun; Xu, Huahu; Shi, Jiangang; Lin, Xinhua; Gao, Yiqin
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 07.01.2025
Subjects	Accuracy; Algorithms; Analysis; Anomalies; Automation; Biology and Life Sciences; Clustering; Computer and Information Sciences; Debugging; Engineering and Technology; Error messages; Feature extraction; Humans; Identification methods; Information processing; Long short-term memory; Machine learning; Methods; Operations management; Pareto optimum; Parsing algorithms; Physical Sciences; Research and Analysis Methods; Robustness; Social Sciences; Statistical methods; Support Vector Machine; Support vector machines; System failures; Virtual computer systems; Virtual environments; China

More Information
Summary:	Virtual machine platforms generate logs in large quantities, and some of these logs are abnormal, indicating security risks or system failures. Identifying abnormal logs with unsupervised anomaly detection is therefore a meaningful task. However, accurate anomaly logs are often difficult to collect in the real world, and log information is inherently noisy. Parsing logs and raising anomaly alerts can also be time-consuming, so improving their effectiveness and accuracy is important. To address these challenges, this paper proposes a method called LADSVM (Long Short-Term Memory + Autoencoder-Decoder + SVM). First, a log parsing algorithm parses the logs. Then a feature extraction algorithm that combines Long Short-Term Memory and Autoencoder-Decoder extracts features. The Autoencoder-Decoder reduces the dimensionality of the data by mapping the high-dimensional input to a low-dimensional latent space, which helps eliminate redundant information and noise, extract key features, and increase robustness. Finally, a Support Vector Machine is applied to the extracted feature vectors to detect anomalies. Experimental results show that, compared with traditional methods, this approach learns better features without any prior knowledge and offers superior noise robustness and performance. LADSVM excels at detecting anomalies in virtual machine logs characterized by strong sequential patterns and noise, but its performance may vary on disordered log data. This highlights the need to select detection methods that match the specific characteristics of each type of log data.
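The summary does not name the log parsing algorithm, so the following is only a rough sketch of what the first stage of such a pipeline might look like, not the authors' method. It assumes a simple regex-masking parser that turns raw messages into templates, maps each template to an integer event ID, and cuts the ID sequence into fixed-size windows of the kind a sequence model such as an LSTM could consume. The masking rules and the names `to_template`, `encode_events`, and `sliding_windows` are hypothetical.

```python
import re

# Hypothetical masking rules (an assumption; the summary does not say
# which parser the paper uses). Order matters: IP addresses and hex
# values are masked before plain integers.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message):
    """Replace variable fields in a log message with placeholder tokens."""
    for pattern, token in _MASKS:
        message = pattern.sub(token, message)
    return message


def encode_events(messages):
    """Map each message to an integer ID for its template.

    Returns the ID sequence and the template-to-ID dictionary.
    IDs are assigned in order of first appearance.
    """
    ids = {}
    sequence = []
    for msg in messages:
        template = to_template(msg)
        if template not in ids:
            ids[template] = len(ids)
        sequence.append(ids[template])
    return sequence, ids


def sliding_windows(sequence, size):
    """Cut an event-ID sequence into overlapping windows of length `size`."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


if __name__ == "__main__":
    logs = [
        "vm 12 started on host 10.0.0.5",
        "vm 7 started on host 10.0.0.9",
        "disk error at 0x1f on vm 12",
        "vm 12 started on host 10.0.0.5",
    ]
    seq, templates = encode_events(logs)
    print(templates)
    print(sliding_windows(seq, 2))
```

In a fuller pipeline, windows like these would feed the LSTM and Autoencoder-Decoder feature extractor, and the resulting low-dimensional vectors would go to the SVM; those stages are omitted here because the summary gives no architectural details.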
Bibliography:	Competing Interests: The authors have declared that no competing interests exist.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0315897