A Configurable Intrinsic Curiosity Module for a Testbed for Developing Intelligent Swarm UAVs

Bibliographic Details
Published in Machine learning with applications Vol. 21; p. 100714
Main Authors Mahmood, Jawad; Raja, Muhammad Adil; Loane, John; McCaffery, Fergal
Format Journal Article
Language English
Published Elsevier Ltd 01.09.2025
ISSN 2666-8270
DOI 10.1016/j.mlwa.2025.100714

Summary: This paper introduces an Intrinsic Curiosity Module (ICM) based Reinforcement Learning (RL) framework for target tracking with swarms of Unmanned Aerial Vehicles (UAVs), leveraging the actor–critic architecture to control the roll, pitch, yaw, and throttle motions of the UAVs. A key challenge in RL-based UAV coordination is the delayed reward problem, which hinders effective learning in dynamic environments. Existing UAV testbeds rely primarily on extrinsic rewards and lack mechanisms for adaptive exploration and efficient UAV coordination. To address these limitations, we propose a testbed that integrates an ICM with the Asynchronous Advantage Actor-Critic (A3C) algorithm for the tracking UAVs. The testbed incorporates a Self-Reflective Curiosity-Weighted (SRCW) hyperparameter tuning mechanism for the ICM, which adaptively modifies hyperparameters based on the RL agent's ongoing performance. In this testbed, the target UAV is guided by an Advantage Actor-Critic (A2C) model, while a swarm of two tracking UAVs is controlled using the A3C-ICM approach. The proposed framework facilitates real-time autonomous coordination among UAVs within a simulated environment. The system is developed using the FlightGear flight simulator and the JSBSim Flight Dynamics Model (FDM), which enable dynamic simulations and continuous interaction between UAVs. Experimental results demonstrate that the tracking UAVs can effectively coordinate and maintain precise paths even under complex conditions.
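
The following minimal Python sketch illustrates the general idea the summary describes: a curiosity bonus computed from a forward model's prediction error, added to the extrinsic reward with a weight that is adjusted from recent agent performance. It is not the paper's implementation. The summary does not specify the SRCW update rule, so the class `AdaptiveCuriosityWeight`, its parameters (`beta`, `step`, `window`), and the rule "lower the weight when returns improve, raise it otherwise" are all assumptions made for illustration.

```python
from collections import deque


def intrinsic_reward(predicted_next, actual_next, scale=0.5):
    """Curiosity bonus: scaled squared error between the forward model's
    predicted next-state features and the features actually observed."""
    return scale * sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


class AdaptiveCuriosityWeight:
    """Hypothetical stand-in for performance-based tuning of the curiosity
    weight. The real SRCW rule is not given in the summary."""

    def __init__(self, beta=0.2, step=0.05, beta_min=0.0, beta_max=1.0, window=5):
        self.beta = beta
        self.step = step
        self.beta_min = beta_min
        self.beta_max = beta_max
        self.returns = deque(maxlen=window)  # recent episode returns

    def update(self, episode_return):
        # Compare the new return with the mean of the recent window.
        if self.returns:
            baseline = sum(self.returns) / len(self.returns)
            if episode_return > baseline:
                # Assumed rule: performance is improving, so explore less.
                self.beta = max(self.beta_min, self.beta - self.step)
            else:
                # Assumed rule: performance stalled, so explore more.
                self.beta = min(self.beta_max, self.beta + self.step)
        self.returns.append(episode_return)
        return self.beta


def total_reward(extrinsic, intrinsic, beta):
    """Reward passed to the actor-critic learner."""
    return extrinsic + beta * intrinsic
```

In a training loop under these assumptions, `update` would be called once per episode and the resulting `beta` passed to `total_reward` at every step, so that a sparse or delayed extrinsic reward is supplemented by the curiosity term.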