Fast Adaptive Task Offloading in Edge Computing Based on Meta Reinforcement Learning

Multi-access edge computing (MEC) aims to extend cloud service to the network edge to reduce network traffic and service latency. A fundamental problem in MEC is how to efficiently offload heterogeneous tasks of mobile applications from user equipment (UE) to MEC hosts. Recently, many deep reinforce...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on parallel and distributed systems Vol. 32; no. 1; pp. 242 - 253
Main Authors Wang, Jin, Hu, Jia, Min, Geyong, Zomaya, Albert Y., Georgalas, Nektarios
Format Journal Article
LanguageEnglish
Published New York IEEE 01.01.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Multi-access edge computing (MEC) aims to extend cloud service to the network edge to reduce network traffic and service latency. A fundamental problem in MEC is how to efficiently offload heterogeneous tasks of mobile applications from user equipment (UE) to MEC hosts. Recently, many deep reinforcement learning (DRL)-based methods have been proposed to learn offloading policies through interacting with the MEC environment that consists of UE, wireless channels, and MEC hosts. However, these methods have weak adaptability to new environments because they have low sample efficiency and need full retraining to learn updated policies for new environments. To overcome this weakness, we propose a task offloading method based on meta reinforcement learning, which can adapt fast to new environments with a small number of gradient updates and samples. We model mobile applications as Directed Acyclic Graphs (DAGs) and the offloading policy by a custom sequence-to-sequence (seq2seq) neural network. To efficiently train the seq2seq network, we propose a method that synergizes the first order approximation and clipped surrogate objective. The experimental results demonstrate that this new offloading method can reduce the latency by up to 25 percent compared to three baselines while being able to adapt fast to new environments.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1045-9219
1558-2183
DOI:10.1109/TPDS.2020.3014896