Spectrum Sharing in Vehicular Networks Based on Multi-Agent Reinforcement Learning
Published in | IEEE Journal on Selected Areas in Communications, Vol. 37, No. 10, pp. 2282-2292
---|---
Main Authors |
Format | Journal Article
Language | English
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.10.2019
Subjects |
Summary | This paper investigates the spectrum sharing problem in vehicular networks based on multi-agent reinforcement learning, where multiple vehicle-to-vehicle (V2V) links reuse the frequency spectrum preoccupied by vehicle-to-infrastructure (V2I) links. Fast channel variations in high-mobility vehicular environments preclude the possibility of collecting accurate instantaneous channel state information at the base station for centralized resource management. In response, we model the resource sharing as a multi-agent reinforcement learning problem, which is then solved using a fingerprint-based deep Q-network method that is amenable to a distributed implementation. The V2V links, each acting as an agent, collectively interact with the communication environment, receive distinctive observations yet a common reward, and learn to improve spectrum and power allocation through updating Q-networks using the gained experiences. We demonstrate that with a proper reward design and training mechanism, the multiple V2V agents successfully learn to cooperate in a distributed way to simultaneously improve the sum capacity of V2I links and payload delivery rate of V2V links.
ISSN | 0733-8716; 1558-0008
DOI | 10.1109/JSAC.2019.2933962
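
The summary above describes each V2V link as an independent deep Q-network agent whose input is augmented with a low-dimensional fingerprint and which is trained on a reward common to all agents. The following is a minimal, hedged sketch of that kind of agent, not the authors' implementation: the observation size, number of sub-bands and power levels, network architecture, and environment interface are all illustrative assumptions.

```python
# Illustrative fingerprint-based DQN agent for V2V spectrum/power selection.
# All dimensions, names, and hyperparameters below are assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

OBS_DIM = 16          # size of a V2V link's local observation (assumed)
N_SUBBANDS = 4        # number of V2I sub-bands available for reuse (assumed)
N_POWER_LEVELS = 3    # number of discrete transmit power levels (assumed)
N_ACTIONS = N_SUBBANDS * N_POWER_LEVELS
FP_DIM = 2            # fingerprint: (exploration rate, normalized iteration)


class QNetwork(nn.Module):
    """Maps [local observation, fingerprint] to Q-values over all actions."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + FP_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)


def make_input(obs, epsilon, iteration, max_iters=10_000):
    """Concatenate the local observation with the low-dimensional fingerprint."""
    fingerprint = torch.tensor([epsilon, iteration / max_iters], dtype=torch.float32)
    return torch.cat([torch.as_tensor(obs, dtype=torch.float32), fingerprint])


class V2VAgent:
    """One V2V link acting as a DQN agent trained on a common reward."""

    def __init__(self, gamma=0.99, lr=1e-3):
        self.q = QNetwork()
        self.target_q = QNetwork()
        self.target_q.load_state_dict(self.q.state_dict())
        self.opt = torch.optim.Adam(self.q.parameters(), lr=lr)
        self.buffer = deque(maxlen=50_000)
        self.gamma = gamma

    def act(self, x, epsilon):
        """Epsilon-greedy choice of a joint (sub-band, power level) action index."""
        if random.random() < epsilon:
            return random.randrange(N_ACTIONS)
        with torch.no_grad():
            return int(self.q(x).argmax())

    def remember(self, x, action, common_reward, x_next):
        self.buffer.append((x, action, common_reward, x_next))

    def train_step(self, batch_size=64):
        """One Q-learning update on a random minibatch of stored transitions."""
        if len(self.buffer) < batch_size:
            return
        xs, acts, rews, xs_next = zip(*random.sample(self.buffer, batch_size))
        x = torch.stack(xs)
        a = torch.tensor(acts)
        r = torch.tensor(rews, dtype=torch.float32)
        x_next = torch.stack(xs_next)
        q_sa = self.q(x).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + self.gamma * self.target_q(x_next).max(dim=1).values
        loss = nn.functional.mse_loss(q_sa, target)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()

    def sync_target(self):
        """Periodically copy online weights into the target network."""
        self.target_q.load_state_dict(self.q.state_dict())
```

In a full training loop, each agent would build its input with `make_input(...)`, pick an action, receive the common reward (which, per the summary, reflects both V2I sum capacity and V2V payload delivery), store the transition, and call `train_step()`. The fingerprint is the part specific to the multi-agent setting: it gives the Q-network a signal about how the other agents' policies are evolving during training, which helps mitigate the non-stationarity of independent learners.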