Assessing the Effectiveness of Vulnerability Detection via Prompt Tuning: An Empirical Study

Bibliographic Details
Published in	Proceedings / Asia Pacific Software Engineering Conference, pp. 415-424
Main Authors	Lu, Guilong; Ju, Xiaolin; Chen, Xiang; Yang, Shaoyu; Chen, Liang; Shen, Hao
Format	Conference Proceeding
Language	English
Published	IEEE 04.12.2023
Subjects	Cross-domain vulnerability detection; Deep learning; Prompt tuning; Software engineering; Task analysis; Tuning; Vulnerability type detection; Vulnerability detection
Summary:	In deep-learning-based vulnerability detection, fine-tuning Pre-trained Language Models (PLMs) is a prevalent technique. However, a natural gap exists between pre-training tasks and vulnerability detection tasks because their input formats differ, and the performance of fine-tuning depends on the scale of the downstream dataset. Prompt tuning has recently been used to alleviate these issues, but it has received little attention in vulnerability detection. To assess its effectiveness, we consider three classical vulnerability detection tasks: within-domain vulnerability detection, cross-domain vulnerability detection, and vulnerability type detection. Our empirical study covers three popular PLMs (CodeBERT, CodeT5, and CodeGPT) and uses the Devign, BigVul, and Reveal datasets as experimental subjects. Our results indicate that (1) compared to fine-tuning, prompt tuning can increase accuracy on the three tasks by an average of 42%, 38%, and 41%, respectively; (2) different prompt templates can affect accuracy by up to 8%; and (3) in data-scarce scenarios, the advantage of prompt tuning over fine-tuning is more pronounced. Our research demonstrates that prompt tuning can help achieve better performance in vulnerability detection tasks and is a promising direction for future research.
ISSN:	2640-0715
DOI:	10.1109/APSEC60848.2023.00052
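
Illustration:	The summary refers to prompt templates without showing one. The sketch below is a minimal, hypothetical example of the general idea: a template rewrites a code snippet as a cloze-style input resembling the masked-token pre-training format, and a verbalizer maps the words predicted at the mask position back to class labels. The template text, the label words, and the function names are assumptions made for illustration and are not taken from the paper; the scores passed to `classify` stand in for what a PLM would assign to candidate words at the mask.

```python
from typing import Dict, List

# Hypothetical cloze template; the paper's actual templates are not given here.
TEMPLATE = "{code} The code is [MASK]."

# Hypothetical verbalizer: each class is represented by a few label words.
VERBALIZER: Dict[str, List[str]] = {
    "vulnerable": ["vulnerable", "defective"],
    "benign": ["safe", "clean"],
}


def build_prompt(code: str, template: str = TEMPLATE, mask_token: str = "[MASK]") -> str:
    """Wrap a code snippet in the template, using the model's own mask token."""
    # Substitute the mask token in the template first so the snippet is left untouched.
    return template.replace("[MASK]", mask_token).format(code=code)


def classify(mask_word_scores: Dict[str, float]) -> str:
    """Return the class whose label words have the highest mean score.

    mask_word_scores maps candidate words to the score a PLM assigned them
    at the mask position; words missing from the mapping count as 0.0.
    """
    def class_score(words: List[str]) -> float:
        return sum(mask_word_scores.get(w, 0.0) for w in words) / len(words)

    return max(VERBALIZER, key=lambda label: class_score(VERBALIZER[label]))


if __name__ == "__main__":
    print(build_prompt("int f(char *s) { char b[8]; strcpy(b, s); return 0; }"))
    # Toy scores standing in for real model output.
    print(classify({"vulnerable": 0.6, "safe": 0.2}))
```

Changing only `TEMPLATE` changes the input the model sees while the classification logic stays the same, which is one way to picture how different templates could lead to different accuracy.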