Research on Buffer Overflow Vulnerability Mining Method Based on Deep Learning

Bibliographic Details
Published in: 2024 2nd International Conference on Big Data and Privacy Computing (BDPC), pp. 28-36
Main Authors: Yongxu, Hou; Ying, Zhou; Pengzhi, Xu; Zhen, Guo
Format: Conference Proceeding
Language: English
Published: IEEE, 10.01.2024
Summary: Buffer overflow is the most widespread and destructive class of vulnerability in software security. Existing mining methods require manual analysis of program code against preset rules, which imposes a heavy workload and yields relatively low efficiency and accuracy. To mine buffer overflow vulnerabilities more efficiently and accurately, this paper designs a deep learning model for buffer overflow vulnerability mining at the source code level. Starting from the common sensitive functions that are prone to buffer overflow, potentially vulnerable code is extracted from the source to form vulnerability code blocks. The call mode of each buffer overflow sensitive function is then analyzed using the parsed code attribute maps, and a different slicing method is applied for each call mode to turn the vulnerability code blocks into code slices. The code slices are mapped to numeric vectors by the Word2vec word vector model. The Bi-GRU-Attn mining model is optimized using the Particle Swarm Algorithm, the Keepy Method, and an Attention Mechanism, and is trained and tested on the CWE-119 (Buffer Errors) dataset from the SARD benchmark. Comparative experiments cover five metrics: precision rate, recall rate, false alarm rate, F1 value, and UC value. The results show that the proposed model mines buffer overflow vulnerabilities effectively and performs better in improving precision rate and reducing false alarm rate.
DOI: 10.1109/BDPC59998.2024.10649293