Deep-Reinforcement-Learning-Based Mode Selection and Resource Allocation for Cellular V2X Communications

Bibliographic Details
Published in	IEEE Internet of Things Journal, Vol. 7, no. 7, pp. 6380-6391
Main Authors	Zhang, Xinran, Peng, Mugen, Yan, Shi, Sun, Yaohua
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2020
Subjects	Algorithms; Cellular communication; Cellular vehicle-to-everything (V2X) communications; Clustering; Clustering algorithms; Computer simulation; Deep reinforcement learning (DRL); Graph theory; Interference; Machine learning; Markov processes; Modal choice; Mode selection; Optimization; Quality of service; Reinforcement learning; Reliability; Reliability engineering; Resource allocation; Resource management; Safety critical; Time; Vehicle-to-everything
ISSN	2327-4662
DOI	10.1109/JIOT.2019.2962715

Summary:	Cellular vehicle-to-everything (V2X) communication is crucial to supporting diverse future vehicular applications. However, for safety-critical applications, unstable vehicle-to-vehicle (V2V) links and the high signaling overhead of centralized resource allocation approaches become bottlenecks. In this article, we investigate a joint optimization problem of transmission mode selection and resource allocation for cellular V2X communications. In particular, the problem is formulated as a Markov decision process, and a deep reinforcement learning (DRL)-based decentralized algorithm is proposed to maximize the sum capacity of vehicle-to-infrastructure users while meeting the latency and reliability requirements of V2V pairs. Moreover, considering the training limitations of local DRL models, a two-timescale federated DRL algorithm is developed to help obtain robust models: the graph theory-based vehicle clustering algorithm is executed on a large timescale, and the federated learning algorithm is in turn conducted on a small timescale. The simulation results show that the proposed DRL-based algorithm outperforms other decentralized baselines and validate the superiority of the two-timescale federated DRL algorithm for newly activated V2V pairs.
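
The two-timescale idea in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm: the summary does not specify how the vehicle graph is built, how models are aggregated, or what the periods are. The sketch assumes that vehicles within a hypothetical `radius` of each other on a 1-D road are linked, that clusters are the connected components of that graph, and that aggregation is a plain element-wise mean of each cluster's model weights. All function and parameter names (`cluster_by_range`, `federated_average`, `run`, `cluster_period`, `average_period`) are illustrative.

```python
from collections import defaultdict


def cluster_by_range(positions, radius):
    """Group vehicles into connected components of a proximity graph.

    Assumption: two vehicles are linked when their 1-D distance is at most
    `radius`; the actual graph construction is not given in the summary.
    """
    ids = list(positions)
    parent = {v: v for v in ids}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if abs(positions[a] - positions[b]) <= radius:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for v in ids:
        clusters[find(v)].append(v)
    return sorted(sorted(c) for c in clusters.values())


def federated_average(weights, cluster):
    """Replace each member's weights with the cluster's element-wise mean."""
    n = len(cluster)
    dim = len(weights[cluster[0]])
    mean = [sum(weights[v][k] for v in cluster) / n for k in range(dim)]
    for v in cluster:
        weights[v] = list(mean)


def run(positions, weights, radius, steps, cluster_period, average_period):
    """Two-timescale loop: recluster rarely, average model weights often."""
    clusters = []
    for t in range(steps):
        if t % cluster_period == 0:   # large timescale: rebuild clusters
            clusters = cluster_by_range(positions, radius)
        # Local DRL training of each V2V agent would happen here (omitted).
        if t % average_period == 0:   # small timescale: aggregate models
            for c in clusters:
                federated_average(weights, c)
    return clusters, weights
```

In this sketch, choosing `cluster_period` larger than `average_period` mirrors the summary's split between slow clustering and fast federated aggregation. A newly activated V2V pair would pick up its cluster's averaged weights at the next aggregation step instead of relying only on its own local training.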