Deep Reinforcement Learning based Contract Incentive for UAVs and Energy Harvest Assisted Computing
Published in | GLOBECOM 2022 - 2022 IEEE Global Communications Conference, pp. 2224 - 2229 |
---|---|
Main Authors | , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 04.12.2022 |
DOI | 10.1109/GLOBECOM48099.2022.10001311 |
Summary: | In this paper, we consider a mobile edge computing (MEC) system with multiple unmanned aerial vehicles (UAVs) and stochastic energy harvesting. The UAVs' mobility can help data offloading over a larger geographical area containing multiple hotspots (HSs). When HSs have offloading requests, the dispatch agent (DA) can recruit different types of UAVs to fly close to the HSs and assist with computation. We aim to maximize the long-term utility of all HSs, subject to the stability of the energy queue. The resulting problem is a joint optimization of the offloading strategy and the contract design in a setting that is dynamic over time. We design a deep reinforcement learning based contract incentive (DRLCI) strategy that solves this joint optimization problem in two steps. First, we use an improved deep Q-network (DQN) algorithm to obtain the offloading decision. Second, to motivate UAVs to participate in resource sharing, we design a contract for the asymmetric-information scenario and use the Lagrangian multiplier method to approach the optimal contract. Simulation results show the feasibility and efficiency of the proposed strategy, which achieves performance very close to that obtained in the complete-information scenario. |
---|---|
DOI: | 10.1109/GLOBECOM48099.2022.10001311 |
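The abstract's first step, obtaining offloading decisions with a DQN-style algorithm, can be illustrated with a minimal sketch. This is not the paper's "improved DQN": it uses a simple linear Q-function with a frozen target copy and semi-gradient TD updates, and the state features, action count, and reward here are hypothetical placeholders for HS load features, offloading choices, and HS utility.

```python
import numpy as np

# Illustrative sketch only: a minimal Q-learning update with a linear
# Q-function and a periodically synced target network. Dimensions and
# reward shaping below are invented for the example, not from the paper.

rng = np.random.default_rng(0)
N_FEATURES, N_ACTIONS = 4, 3           # hypothetical state size / offload choices
GAMMA, LR = 0.9, 0.05                  # discount factor, learning rate

W = rng.normal(scale=0.1, size=(N_ACTIONS, N_FEATURES))  # online weights
W_target = W.copy()                                       # frozen target weights

def q_values(w, s):
    """Q(s, a) for every action a under weight matrix w."""
    return w @ s

def dqn_step(s, a, r, s_next, done):
    """One semi-gradient update on the transition (s, a, r, s')."""
    global W
    target = r if done else r + GAMMA * q_values(W_target, s_next).max()
    td_error = target - q_values(W, s)[a]
    W[a] += LR * td_error * s          # gradient of the linear Q w.r.t. W[a]
    return td_error

# Tiny usage example on synthetic transitions.
for t in range(200):
    s = rng.random(N_FEATURES)
    a = int(rng.integers(N_ACTIONS))
    r = float(s.sum())                 # stand-in for an HS utility reward
    s_next = rng.random(N_FEATURES)
    dqn_step(s, a, r, s_next, done=(t % 20 == 19))
    if t % 10 == 9:                    # periodic target-network sync
        W_target = W.copy()
```

A full implementation would replace the linear Q-function with a neural network and add experience replay; this fragment only shows the target/online split and TD update that define the DQN family.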