Multi-Agent Imitation Learning for Pervasive Edge Computing: A Decentralized Computation Offloading Algorithm

Bibliographic Details
Published in	IEEE Transactions on Parallel and Distributed Systems, Vol. 32, no. 2, pp. 411-425
Main Authors	Wang, Xiaojie; Ning, Zhaolong; Guo, Song
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2021
Subjects	Algorithms; Cloud computing; Completion time; Computation offloading; Computational modeling; decentralized execution; Edge computing; Game theory; Games; imitation learning; Machine learning; Multiagent systems; Performance evaluation; Pervasive edge computing; Processor scheduling; Task analysis; Utilities
Summary:	Pervasive edge computing is a kind of edge computing that relies solely on edge devices with sensing, storage and communication abilities to realize peer-to-peer offloading without centralized management. Due to the lack of unified coordination, users always pursue profits by maximizing their own utilities. However, users may not make appropriate scheduling decisions based only on their local observations, and guaranteeing fairness among different edge devices in a fully decentralized environment is challenging. To address these issues, we propose a decentralized computation offloading algorithm that aims to minimize the average task completion time in pervasive edge computing networks. We first derive a Nash equilibrium among devices using stochastic game theory, based on full observations of the system states. We then design a traffic offloading algorithm based on partial observations by integrating generative adversarial imitation learning. Multiple experts provide demonstrations, so that devices can mimic the behaviors of their corresponding experts by minimizing the gaps between the distributions of their observation-action pairs. Finally, theoretical and performance results show that our solution has a significant advantage over other representative algorithms.
ISSN:	1045-9219, 1558-2183
DOI:	10.1109/TPDS.2020.3023936
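
The summary describes devices mimicking experts by minimizing the gap between the distributions of their observation-action pairs. The sketch below is only a toy illustration of that idea, not the authors' algorithm: it builds empirical distributions over discrete (observation, action) pairs and measures their gap with the Jensen-Shannon divergence. The function names, the observation labels ("low", "high") and the action labels ("local", "offload") are all hypothetical, and a real adversarial imitation learner would estimate such a gap with a trained discriminator rather than by counting.

```python
import math
from collections import Counter


def pair_distribution(pairs):
    """Empirical distribution over (observation, action) pairs."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    support = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint support.
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in support}

    def kl(a):
        # KL(a || m), skipping outcomes that a assigns zero probability.
        return sum(a[x] * math.log(a[x] / m[x]) for x in support if a.get(x, 0.0) > 0.0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical demonstrations: the expert offloads only under high load.
expert_pairs = [("low", "local"), ("low", "local"), ("high", "offload"), ("high", "offload")]
# A learner that does the opposite of the expert in every observed state.
opposite_pairs = [("low", "offload"), ("low", "offload"), ("high", "local"), ("high", "local")]
# A learner that reproduces the expert's behavior exactly.
matching_pairs = list(expert_pairs)

expert = pair_distribution(expert_pairs)
gap_opposite = js_divergence(expert, pair_distribution(opposite_pairs))
gap_matching = js_divergence(expert, pair_distribution(matching_pairs))
```

In this toy setting the gap is largest when the learner's pairs never overlap with the expert's and drops to zero when the two distributions coincide, which is the direction an imitation learner would push its policy.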