Exploiting Deep Reinforcement Learning for Stochastic AoI Minimization in Multi-UAV-assisted Wireless Networks
Published in | 2024 IEEE Wireless Communications and Networking Conference (WCNC) pp. 1 - 6 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 21.04.2024 |
Subjects | |
Summary: | In this paper, we consider a wireless sensing network assisted by multiple unmanned aerial vehicles (UAVs), in which low-power ground users (GUs) periodically sense environmental information and upload their most recent sensing information to a base station (BS). The GUs first backscatter their information to the UAVs, and the UAVs then forward it to the BS using non-orthogonal multiple access (NOMA) transmissions. Our goal is to minimize the long-term age of information (AoI) by jointly optimizing the UAVs' sensing scheduling, transmission control, and trajectories. To solve this problem, we propose a Lyapunov-driven hierarchical proximal policy optimization framework, named Lya-HPPO, that decouples the multi-stage AoI minimization problem into several control subproblems. In each control subproblem, the UAVs' sensing scheduling and transmission control are first determined by an outer-loop deep reinforcement learning (DRL) approach, and an inner-loop optimization module then updates the UAVs' trajectories. Simulation results verify that the proposed Lya-HPPO framework converges quickly to a stable value and can make online decisions in real time while guaranteeing long-term data buffer and AoI stability. |
---|---|
ISSN: | 1558-2612 |
DOI: | 10.1109/WCNC57260.2024.10570857 |