Reinforcement Learning-Based Fuzz Testing for the Gazebo Robotic Simulator

Bibliographic Details
Published in: Proceedings of the ACM on Software Engineering, Vol. 2, no. ISSTA, pp. 1467-1488
Main Authors: Ren, Zhilei; Li, Yitao; Li, Xiaochen; Qi, Guanxiao; Xuan, Jifeng; Jiang, He
Format: Journal Article
Language: English
Published: New York, NY, USA, ACM, 22.06.2025
Summary: Gazebo, the most widely used simulator in robotics, plays a pivotal role in developing and testing robotic systems. Given its impact on the safety and reliability of robotic operations, early bug detection is critical. However, because of Gazebo's strict input structures and vast state space, directly applying existing fuzz testing approaches to it is not effective. In this paper, we present GzFuzz, the first fuzz testing framework designed for Gazebo. GzFuzz addresses these challenges through a syntax-aware feasible command generation mechanism, which handles the strict input requirements, and a reinforcement learning-based command generator selection mechanism, which explores the state space efficiently. By combining the two mechanisms in a unified framework, GzFuzz detects bugs in Gazebo effectively. In extensive experiments, GzFuzz detects an average of 9.6 unique bugs in 12 hours and achieves substantially higher code coverage than the existing fuzzers AFL++ and Fuzzotron, a relative improvement of approximately 239% to 363%. In less than six months, GzFuzz uncovered 25 unique crashes in Gazebo, 24 of which have been fixed or confirmed. Our results highlight the importance of fuzzing Gazebo directly and present a novel and effective methodology that can inspire better testing across a broader range of simulators.
ISSN: 2994-970X
DOI: 10.1145/3728942