Power Control Based on Deep Reinforcement Learning for Spectrum Sharing
| Published in | IEEE Transactions on Wireless Communications, Vol. 19, No. 6, pp. 4209–4219 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE, 01.06.2020 |
| Publisher | The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: In current research, artificial intelligence (AI) plays a crucial role in resource management for next-generation wireless communication networks. However, traditional reinforcement learning (RL) cannot handle continuous, high-dimensional problems, so deep neural networks (DNNs) are introduced into RL to address them. In this paper, we first construct an information interaction model among the primary user (PU), the secondary user (SU), and wireless sensors in a cognitive radio system. In this model, the SU cannot obtain the PU's power allocation information and must use the received signal strengths (RSSs) reported by the wireless sensors to adjust its own transmit power, while the PU allocates transmit power according to its own power control scheme. We propose an asynchronous advantage actor-critic (A3C)-based power control scheme for the SU, a parallel actor-learner framework with root mean square propagation (RMSProp) optimization; multiple SUs learn the power control scheme simultaneously on different CPU threads, which reduces the interdependence of neural network gradient updates. To further improve the efficiency of spectrum sharing, we also propose a distributed proximal policy optimization (DPPO)-based power control scheme, an asynchronous actor-critic variant with adaptive moment estimation (Adam) optimization that enables the network to converge quickly. After several power adjustments, the PU and the SU meet their quality of service (QoS) requirements and achieve spectrum sharing.
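The A3C-based scheme summarized above can be pictured with a minimal, single-worker actor-critic sketch. The asynchronous multi-thread part is omitted, and the toy RSS environment, reward shaping, network sizes, and all numeric constants below are illustrative assumptions rather than the authors' actual system model; only the general structure (a policy over discrete SU power levels learned from sensor RSS observations with a one-step advantage estimate and RMSProp) follows the abstract.

```python
# Minimal actor-critic sketch for secondary-user (SU) power control.
# Single worker only; the parallel actor-learners of A3C are omitted.
# The environment and all constants are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

POWER_LEVELS = np.array([0.1, 0.5, 1.0, 2.0])   # assumed discrete SU transmit powers (W)
N_SENSORS = 4                                   # assumed number of wireless sensors

class ActorCritic(nn.Module):
    """Shared trunk with a policy head (over power levels) and a value head."""
    def __init__(self, n_obs, n_actions):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_obs, 64), nn.ReLU())
        self.policy = nn.Linear(64, n_actions)  # logits over power levels
        self.value = nn.Linear(64, 1)           # state-value estimate

    def forward(self, obs):
        h = self.trunk(obs)
        return self.policy(h), self.value(h)

def toy_rss_env_step(su_power, rng):
    """Assumed toy environment: sensors report RSS, and the reward favors
    satisfying crude PU-protection and SU-QoS checks (placeholder only)."""
    pu_power = 2.0 + 0.5 * rng.standard_normal()           # PU runs its own power control
    rss = np.log1p(pu_power + su_power) + 0.05 * rng.standard_normal(N_SENSORS)
    pu_ok = pu_power / (su_power + 0.1) > 2.0               # crude PU protection check
    su_ok = su_power / 0.1 > 2.0                            # crude SU QoS check
    reward = 1.0 if (pu_ok and su_ok) else -1.0
    return rss.astype(np.float32), reward

def train(steps=2000, gamma=0.95, seed=0):
    rng = np.random.default_rng(seed)
    net = ActorCritic(N_SENSORS, len(POWER_LEVELS))
    opt = torch.optim.RMSprop(net.parameters(), lr=1e-3)    # RMSProp, as in the A3C variant
    obs, _ = toy_rss_env_step(POWER_LEVELS[0], rng)
    for _ in range(steps):
        logits, value = net(torch.from_numpy(obs))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        next_obs, reward = toy_rss_env_step(POWER_LEVELS[action.item()], rng)
        with torch.no_grad():
            _, next_value = net(torch.from_numpy(next_obs))
        # One-step advantage estimate: r + gamma * V(s') - V(s)
        advantage = reward + gamma * next_value.squeeze() - value.squeeze()
        policy_loss = -dist.log_prob(action) * advantage.detach()
        value_loss = advantage.pow(2)
        loss = policy_loss + 0.5 * value_loss - 0.01 * dist.entropy()
        opt.zero_grad()
        loss.backward()
        opt.step()
        obs = next_obs
    return net

if __name__ == "__main__":
    train()
```

The DPPO-based variant described in the abstract would replace this plain policy-gradient loss with PPO's clipped surrogate objective, distribute the rollout collection across workers, and switch the optimizer to Adam.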
ISSN: 1536-1276, 1558-2248
DOI: 10.1109/TWC.2020.2981320