ChatHTTPFuzz: large language model-assisted IoT HTTP fuzzing

Internet of Things (IoT) devices offer convenience through web interfaces, web VPNs, and other web-based services, all relying on the HTTP protocol. However, these externally exposed HTTP services present significant security risks. Although fuzzing has shown some effectiveness in identifying vulner...

Full description

Saved in:
Bibliographic Details
Published inInternational journal of machine learning and cybernetics Vol. 16; no. 7-8; pp. 4577 - 4598
Main Authors Yang, Zhe, Peng, Hao, Jiang, Yanling, Li, Xingwei, Du, Haohua, Wang, Shuhai, Liu, Jianwei
Format Journal Article
LanguageEnglish
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.08.2025
Springer Nature B.V
Subjects
Online AccessGet full text
ISSN1868-8071
1868-808X
DOI10.1007/s13042-024-02527-3

Cover

Loading…
More Information
Summary:Internet of Things (IoT) devices offer convenience through web interfaces, web VPNs, and other web-based services, all relying on the HTTP protocol. However, these externally exposed HTTP services present significant security risks. Although fuzzing has shown some effectiveness in identifying vulnerabilities in IoT HTTP services, most state-of-the-art tools still rely on random mutation strategies, leading to difficulties in accurately understanding the HTTP protocol’s structure and generating many invalid test cases. Furthermore, These fuzzers rely on a limited set of initial seeds for testing. While this approach initiates testing, the limited number and diversity of seeds hinder comprehensive coverage of complex scenarios in IoT HTTP services. In this paper, we investigate and find that large language models (LLMs) excel in parsing HTTP protocol data and analyzing code logic. Based on these findings, we propose a novel LLM-guided IoT HTTP fuzzing method, ChatHTTPFuzz, which automatically parses protocol fields and analyzes service code logic to generate protocol-compliant test cases. Specifically, we use LLMs to label fields in HTTP protocol data, creating seed templates. Second, The LLM analyzes service code to guide the generation of additional packets aligned with the code logic, enriching the seed templates and their field values. Finally, we design an enhanced Thompson sampling algorithm based on the exploration balance factor and mutation potential factor to schedule seed templates. We evaluate ChatHTTPFuzz on 16 different real-world IoT devices. It finds more vulnerabilities than SNIPUZZ, BOOFUZZ, and MUTINY. ChatHTTPFuzz has discovered 116 vulnerabilities, of which 70 are unique, and 23 have been assigned CVEs.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1868-8071
1868-808X
DOI:10.1007/s13042-024-02527-3