Deep-Reinforcement-Learning-Based Spectrum Resource Management for Industrial Internet of Things
Published in | IEEE Internet of Things Journal, vol. 8, no. 5, pp. 3476-3489 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.03.2021 |
Summary: | The Industrial Internet of Things (IIoT) has attracted tremendous interest from both industry and academia, as it can significantly improve production efficiency and system intelligence. However, with the explosive growth of various types of user equipment (UE) and data flows, IIoT faces a scarcity of spectrum resources for wireless applications. In this article, we propose a spectrum resource management solution for the IIoT network, with the objective of facilitating the sharing of limited spectrum among different kinds of UEs. To overcome the challenges of unknown, dynamic IIoT environments, a modified deep Q-learning network (MDQN) is developed. Considering the cost-effectiveness of IIoT devices, the base station (BS) acts as a single agent and centrally manages the spectrum resources, which can be executed without coordination or information exchange between UEs. We first build a realistic IIoT model and design a simple medium access control (MAC) frame structure to facilitate observation of the environment state. Then, a new reward function is designed to drive the learning process, which takes into account the different communication requirements of the various types of UEs. In addition, to improve learning efficiency, we compress the action space and propose a priority experience replay strategy based on decreasing temporal difference (TD) error. Finally, simulation results show that the proposed algorithm successfully achieves dynamic spectrum resource management in the IIoT network. Compared with other algorithms, it achieves superior network performance with a faster convergence rate. |
---|---|
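The summary mentions a priority experience replay strategy based on decreasing TD error. The paper's exact scheme is not given in this record, but the general idea of replaying transitions in proportion to their absolute TD error can be sketched as follows; the class name, `alpha` exponent, and eviction policy here are illustrative assumptions, not the authors' implementation:

```python
import random

class PrioritizedReplayBuffer:
    """Sketch of TD-error-proportional experience replay: transitions with
    larger |TD error| are sampled more often, and priorities are refreshed
    (typically decreasing) as the Q-network improves."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # 0 = uniform sampling, 1 = fully prioritized
        self.eps = eps          # keeps every priority strictly positive
        self.buffer = []        # stored transitions
        self.priorities = []    # one priority per stored transition

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.buffer) >= self.capacity:   # evict the oldest when full
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Draw indices with probability proportional to stored priority.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)), weights=probs, k=batch_size)
        return idxs, [self.buffer[i] for i in idxs]

    def update_priorities(self, idxs, td_errors):
        # After a learning step, refresh priorities with the new TD errors,
        # which shrink as the value estimates converge.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

In a single-agent setting such as the BS described above, the agent would add each (state, action, reward, next-state) transition with its current TD error, sample minibatches from this buffer for Q-network updates, and then call `update_priorities` with the recomputed errors.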
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2020.3022861 |