Prompting Large Language Models for Malicious Webpage Detection
Published in: 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning (PRML), pp. 393-400
Format: Conference Proceeding
Language: English
Published: IEEE, 04.08.2023
Summary: This work proposes a novel approach to malicious webpage detection by leveraging Large Language Models (LLMs). Unlike existing approaches that analyze only Uniform Resource Locator (URL) features, our approach considers the web content when identifying malicious webpages. The major challenge is the lack of large-scale malicious-webpage datasets with crawled web content for training previous data-driven models. To mitigate this challenge, we investigate prompting LLMs for the malicious webpage detection task, thus removing the dependence on annotated training data. Using the popular GPT-3.5 and ChatGPT as our LLM engines, we study zero-shot and few-shot prompting methods to adapt these LLMs to malicious webpage detection. Experimental results show that our proposed approach achieves comparable or even better performance than deep learning baselines. Our analysis highlights the importance of integrating webpage content in detecting malicious URLs and demonstrates the feasibility of using LLMs to detect cybersecurity threats.
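The zero-shot and few-shot prompting setup summarized above can be sketched as follows. The paper does not publish its exact prompt wording, truncation limit, or helper names, so everything below (the `build_zero_shot_prompt` / `build_few_shot_prompt` functions, the one-word `MALICIOUS`/`BENIGN` answer format, and the 2000-character cutoff) is an illustrative assumption, not the authors' implementation:

```python
# Hypothetical sketch of prompting an LLM for malicious webpage detection.
# The prompt template, label vocabulary, and truncation limit are assumptions;
# the paper's actual prompts and parameters are not reproduced here.

INSTRUCTION = (
    "You are a cybersecurity analyst. Decide whether the webpage below is "
    "malicious (phishing, malware, scam) or benign. "
    "Answer with exactly one word: MALICIOUS or BENIGN."
)

def build_zero_shot_prompt(url: str, page_text: str, max_chars: int = 2000) -> str:
    """Compose a zero-shot prompt from a URL and its crawled page content."""
    snippet = page_text[:max_chars]  # truncate long pages to fit the context window
    return f"{INSTRUCTION}\nURL: {url}\nPage content:\n{snippet}\nVerdict:"

def build_few_shot_prompt(examples, url: str, page_text: str) -> str:
    """Prepend labeled (url, content, label) demonstrations before the query."""
    demos = "\n".join(
        f"URL: {u}\nPage content:\n{c}\nVerdict: {label}"
        for u, c, label in examples
    )
    return f"{INSTRUCTION}\n{demos}\n{build_zero_shot_prompt(url, page_text)[len(INSTRUCTION) + 1:]}"

def parse_verdict(model_reply: str) -> str:
    """Map a free-form model reply onto a binary label."""
    return "malicious" if "MALICIOUS" in model_reply.upper() else "benign"

prompt = build_zero_shot_prompt(
    "http://example-login-update.com/verify",
    "<html><body>Your account is suspended. Enter your password now.</body></html>",
)
```

The resulting string would be sent as a single user message to GPT-3.5 or ChatGPT; `parse_verdict` then reduces the free-form completion to a binary label, which is one simple way to make generative output comparable against classifier baselines.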
DOI: 10.1109/PRML59573.2023.10348229