AIDetectVul: Software Vulnerability Detection Method Based on Feature Fusion of Pre-trained Models

Bibliographic Details
Published in	2025 5th International Conference on Consumer Electronics and Computer Engineering (ICCECE), pp. 258-263
Main Authors	Xue, Shiying; Li, Lin; Li, Tao; Chen, Haodong; Li, Jiapan; Qin, Yangqing
Format	Conference Proceeding
Language	English
Published	IEEE 28.02.2025
Subjects	Accuracy; Codes; Computational modeling; Computer architecture; Computing Power Facilities; Feature extraction; Feature Fusion; Generalization; Large language models; Pretrained Model; Training; Training data; Transformers; Vectors; Vulnerability Detection
DOI	10.1109/ICCECE65250.2025.10985370

Summary:	Data-driven deep learning models are constrained by the scale and diversity of their training data, which makes them vulnerable to data bias. Large language models (LLMs) generalize better in vulnerability detection, but their low inference efficiency and high computational cost hinder practical deployment in industrial settings. To address these limitations, we propose AIDetectVul, a vulnerability detection framework based on feature fusion of pre-trained models. Our approach uses encoder-only and decoder-only architectures in parallel to extract complementary code embeddings and fuses them to increase semantic diversity. The fused representations are then processed by a Transformer model, whose self-attention mechanism captures long-range code dependencies, improving both detection accuracy and generalization capability. Evaluations on proprietary enterprise datasets and open-source benchmarks show that AIDetectVul achieves detection accuracy comparable to the state-of-the-art LineVul model, with measurable improvements in generalization performance. Compared to LLM-based approaches, our solution has significantly lower computational overhead and training costs, making it particularly suitable for industrial applications.
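
The pipeline described in the summary (two embedding sources, feature fusion, then self-attention) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the record does not state the fusion operator, so concatenation is assumed here, the toy vectors stand in for the outputs of the encoder-only and decoder-only models, the two sources are assumed to be aligned to the same token positions, and the attention step uses identity projections instead of learned weights. All function names are hypothetical.

```python
import math


def fuse(enc_tokens, dec_tokens):
    """Fuse two per-token embedding sequences by concatenation.

    Assumption: concatenation is only one plausible fusion operator;
    the record does not say which one the paper uses.
    """
    if len(enc_tokens) != len(dec_tokens):
        raise ValueError("both sources must cover the same number of tokens")
    # For Python lists, `e + d` joins the two vectors end to end.
    return [e + d for e, d in zip(enc_tokens, dec_tokens)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Queries, keys, and values are the inputs themselves (identity
    projections), so every output row is a weighted average of all rows.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return attended


def mean_pool(tokens):
    """Average token vectors into one fixed-size representation."""
    count = len(tokens)
    return [sum(t[i] for t in tokens) / count for i in range(len(tokens[0]))]


# Toy stand-ins for embeddings of a three-token code snippet.
enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                 # hypothetical encoder-only output
dec = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]  # hypothetical decoder-only output

fused = fuse(enc, dec)            # 3 tokens, 2 + 3 = 5 dimensions each
attended = self_attention(fused)  # every token attends to every other token
pooled = mean_pool(attended)      # one vector that a classifier head could consume
```

Because each attended row mixes information from every position in the sequence, distant tokens can influence one another directly, which is the property the summary refers to as capturing long-range code dependencies. In practice the two sources may use different tokenizers, in which case fusing pooled sequence-level vectors rather than per-token vectors would be an alternative.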