A Machine Learning-Based Method for Automatic Recognition of Logging Functions

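As software continues to grow in scale, logs play an increasingly important role in failure detection. However, software logging currently lacks a unified standard and is often shaped by developers' personal habits, which makes automated log analysis in large-scale systems challenging. Recognizing logging functions is a prerequisite for log analysis and directly affects its results. This paper proposes a machine learning-based method to support automatic log recognition. By systematically analyzing widely used large-scale open-source software, the authors summarize the main forms in which logging functions are written and extract the features common to these forms, and on that basis implement iLog, a machine learning-based tool for automatic log recognition. Experiments show that iLog's ability to identify logging functions is on average 76 times that of matching specific keywords, and ten-fold cross-validation yields an F-Score of 0.93 for iLog's results.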
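For readers unfamiliar with the evaluation terms in the abstract, the following is a minimal sketch of how an F-Score and a ten-fold split are conventionally computed. It uses the standard definitions only; the record does not describe how iLog itself implements its evaluation, and the function names `f_score` and `k_fold_indices` are hypothetical.

```python
from typing import Iterator, List, Tuple


def f_score(predicted: List[bool], actual: List[bool]) -> float:
    """Standard F1: harmonic mean of precision and recall.

    predicted[i] is True if a function was flagged as a logging function,
    actual[i] is True if it really is one (illustrative labels only).
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

In ten-fold cross-validation, a classifier would be trained on each `train` split and scored on the matching `test` split, with the per-fold scores then averaged.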

Bibliographic Details
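Published in: Computer Engineering & Science (计算机工程与科学), Vol. 39, no. 1, pp. 111-117
Main Authors: JIA Zhou-yang (贾周阳), LIAO Xiang-ke (廖湘科), LIU Xiao-dong (刘晓东), LI Shan-shan (李姗姗), ZHOU Shu-lin (周书林), XIE Xin-wei (谢欣伟)
Format: Journal Article
Language: Chinese
Published: College of Computer, National University of Defense Technology (国防科学技术大学计算机学院), Changsha, Hunan 410073, 2017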
ISSN: 1007-130X
DOI: 10.3969/j.issn.1007-130X.2017.01.015

Bibliography: JIA Zhou-yang, LIAO Xiang-ke, LIU Xiao-dong, LI Shan-shan, ZHOU Shu-lin, XIE Xin-wei (College of Computer, National University of Defense Technology, Changsha 410073, China)
CN 43-1258/TP
As software continues to scale up, logging has become an indispensable part of failure diagnosis. Very similar symptoms may be caused by different software bugs, and the most obvious evidence is always the log messages. Meanwhile, the development of most large-scale software is shaped by developers' personal habits rather than guided by a common specification, so log-related analysis suffers in large-scale software. Recognizing logging functions is a precondition for log analysis and directly affects its results. We propose a machine learning method to fill the gap that logging function recognition has received little attention in most existing log-related work. Learning from widely used software, we summarize three logging functions, extract five common fe