A Configurable Intrinsic Curiosity Module for a Testbed for Developing Intelligent Swarm UAVs
This paper introduces an Intrinsic Curiosity Module (ICM) based Reinforcement Learning (RL) framework for swarm Unmanned Aerial Vehicles (UAVs) target tracking, leveraging the actor–critic architecture to control the roll, pitch, yaw, and throttle motions of UAVs. A key challenge in RL-based UAV coo...
Saved in:
Published in | Machine learning with applications Vol. 21; p. 100714 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
Elsevier Ltd
01.09.2025
Elsevier |
Subjects | |
Online Access | Get full text |
ISSN | 2666-8270 2666-8270 |
DOI | 10.1016/j.mlwa.2025.100714 |
Cover
Loading…
Summary: | This paper introduces an Intrinsic Curiosity Module (ICM) based Reinforcement Learning (RL) framework for swarm Unmanned Aerial Vehicles (UAVs) target tracking, leveraging the actor–critic architecture to control the roll, pitch, yaw, and throttle motions of UAVs. A key challenge in RL-based UAV coordination is the delayed reward problem, which hinders effective learning in dynamic environments. Existing UAV testbeds rely primarily on extrinsic rewards and lack mechanisms for adaptive exploration and efficient UAV coordination. To address these limitations, we propose a testbed that integrates an ICM with the Asynchronous Advantage Actor-Critic (A3C) algorithm for tracking UAVs. It incorporates the Self-Reflective Curiosity-Weighted (SRCW) hyperparameter tuning mechanism for the ICM, which adaptively modifies hyperparameters based on the ongoing RL agent’s performance. In this testbed, the target UAV is guided by the Advantage Actor-Critic (A2C) model, while a swarm of two tracking UAVs is controlled by using the A3C-ICM approach. The proposed framework facilitates real-time autonomous coordination among UAVs within a simulated environment. This system is developed using the FlightGear flight simulator and the JSBSim Flight Dynamics Model (FDM), which enables dynamic simulations and continuous interaction between UAVs. Experimental results demonstrate that the tracking UAVs can effectively coordinate and maintain precise paths even under complex conditions. |
---|---|
ISSN: | 2666-8270 2666-8270 |
DOI: | 10.1016/j.mlwa.2025.100714 |