Prompting Large Language Models for Malicious Webpage Detection

Bibliographic Details
Published in: 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning (PRML), pp. 393-400
Main Authors: Li, Lu; Gong, Bojie
Format: Conference Proceeding
Language: English
Published: IEEE, 04.08.2023

Summary: This work proposes a novel approach to malicious webpage detection that leverages Large Language Models (LLMs). Unlike existing approaches, which analyze only Uniform Resource Locator (URL) features, our approach also considers webpage content when identifying malicious webpages. The major challenge is the lack of large-scale malicious-webpage datasets with crawled web content for training the data-driven models used in prior work. To mitigate this challenge, we investigate prompting LLMs for the malicious webpage detection task, thereby removing the dependence on annotated training data. Using the popular GPT-3.5 and ChatGPT as our LLM engines, we study zero-shot and few-shot prompting methods to adapt these LLMs to malicious webpage detection. Experimental results show that our approach achieves comparable or even better performance than deep learning baselines. Our analysis highlights the importance of integrating webpage content into malicious URL detection and demonstrates the feasibility of using LLMs to detect cybersecurity threats.
DOI: 10.1109/PRML59573.2023.10348229
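
The summary describes zero-shot prompting of GPT-3.5/ChatGPT over crawled webpage content. Below is a minimal sketch of what such a zero-shot setup could look like, assuming an OpenAI-style chat API; the prompt wording, the `classify_webpage` helper, and the content-truncation length are illustrative assumptions, not the authors' exact method.

```python
# Hypothetical sketch of zero-shot prompting for malicious webpage detection.
# Requires the `openai` package (>= 1.0) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def classify_webpage(url: str, page_text: str) -> str:
    """Ask the model whether a webpage is malicious or benign (zero-shot)."""
    prompt = (
        "You are a cybersecurity analyst. Given a URL and the text content "
        "of its webpage, answer with exactly one word: malicious or benign.\n\n"
        f"URL: {url}\n"
        f"Webpage content:\n{page_text[:4000]}\n"  # truncate to fit the context window
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for a classification task
    )
    return response.choices[0].message.content.strip().lower()
```

A few-shot variant of this sketch would prepend a handful of labeled URL/content examples as earlier user/assistant turns in `messages` before the query page.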