A Network Packet Analysis Method to Discover Malicious Activities

Bibliographic Details
Published in	Journal of information science theory and practice Vol. 10; no. special; pp. 143 - 153
Main Authors	Kwon, Taewoong; Lee, Jun; Song, Jungsuk; Myung, Joonwoo; Kim, Kyu-il
Format	Journal Article
Language	English
Published	Daejeon: Korea Institute of Science and Technology Information (한국과학기술정보연구원), 01.06.2022
Subjects	Artificial intelligence; data preprocessing; machine learning; natural language processing; security management; security monitoring; library and information science
Summary:	As networks develop and the number of network devices grows, the number of cyber attacks targeting them is also increasing. Because these attacks aim to steal important information and destroy systems, social and economic damage must be minimized through early detection and rapid response. Many studies have applied machine learning (ML) and artificial intelligence (AI) to this problem, and payload learning is one of the most intuitive and effective ways to detect malicious behavior. In this study, we propose a preprocessing method that maximizes model performance when the payload is learned in term units. The proposed method constructs a high-quality training data set by eliminating unnecessary noise (stopwords) and preserving important features, taking into account both the machine-language and natural-language characteristics of the packet payload. The method consists of three steps: preserving significant special characters, generating a stopword list, and refining class labels. Processing packets of varied and complex structure through these three steps yields high-quality training data that helps build high-performance ML/AI models for security monitoring. We demonstrate the effectiveness of the proposed method by comparing the performance of an AI model trained with and without it. Furthermore, by evaluating the AI model trained with the proposed method in a real-world Security Operating Center (SOC) environment with live network traffic, we demonstrate the applicability of our method to real environments.
Bibliography:	http://data.doi.or.kr/10.1633/JISTaP.2022.10.S.14
ISSN:	2287-9099 2287-4577
DOI:	10.1633/JISTaP.2022.10.S.14
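
The three preprocessing steps named in the summary can be pictured with a minimal Python sketch. The record gives no implementation details, so everything below is an assumption made for illustration and not the authors' method: the set of characters in SIGNIFICANT_CHARS, the document-frequency rule and the min_ratio threshold used to pick stopwords, the rule of dropping payloads that appear under conflicting labels, and all function names.

```python
import re
from collections import Counter, defaultdict

# Assumption: special characters treated as significant and kept as terms.
SIGNIFICANT_CHARS = "<>'\"=/();%"


def tokenize(payload, keep_chars=SIGNIFICANT_CHARS):
    """Step 1 (sketch): split a payload into terms, keeping chosen special characters."""
    pattern = r"[A-Za-z0-9_]+|[" + re.escape(keep_chars) + r"]"
    return re.findall(pattern, payload)


def build_stopwords(samples, min_ratio=0.9):
    """Step 2 (sketch): terms found in at least min_ratio of the payloads of every class.

    samples is a list of (tokens, label) pairs. Such terms do not help
    separate the classes, so this sketch treats them as noise.
    """
    doc_freq = defaultdict(Counter)
    class_sizes = Counter()
    for tokens, label in samples:
        class_sizes[label] += 1
        doc_freq[label].update(set(tokens))  # count each term once per payload
    vocab = set().union(*(counts.keys() for counts in doc_freq.values()))
    return {
        term
        for term in vocab
        if all(doc_freq[lab][term] / class_sizes[lab] >= min_ratio for lab in class_sizes)
    }


def refine_labels(samples):
    """Step 3 (sketch): drop payloads whose identical term sequence has conflicting labels."""
    labels_seen = defaultdict(set)
    for tokens, label in samples:
        labels_seen[tuple(tokens)].add(label)
    return [(t, lab) for t, lab in samples if len(labels_seen[tuple(t)]) == 1]


def preprocess(raw_samples, min_ratio=0.9):
    """Run the three sketched steps on a list of (payload, label) pairs."""
    tokenized = [(tokenize(payload), label) for payload, label in raw_samples]
    stopwords = build_stopwords(tokenized, min_ratio)
    filtered = [
        ([t for t in tokens if t not in stopwords], label)
        for tokens, label in tokenized
    ]
    return refine_labels(filtered), stopwords


if __name__ == "__main__":
    # Hypothetical payloads, invented for this example.
    raw = [
        ("GET /index.html HTTP/1.1", "benign"),
        ("GET /search?q=cat HTTP/1.1", "benign"),
        ("GET /item?id=1' OR '1'='1 HTTP/1.1", "malicious"),
        ("GET /x?<script>alert(1)</script> HTTP/1.1", "malicious"),
    ]
    cleaned, stopwords = preprocess(raw)
    print(sorted(stopwords))
    for tokens, label in cleaned:
        print(label, tokens)
```

In this sketch, terms shared by nearly every payload in every class are discarded, while quote and equals characters that appear only in some payloads survive as terms. A real deployment would need thresholds and character sets tuned on actual traffic.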