Deep Deterministic Policy Gradient with Prioritized Sampling for Power Control
Published in | IEEE Access, Vol. 8, p. 1
---|---
Main Authors | , , ,
Format | Journal Article
Language | English
Published | Piscataway: IEEE, 01.01.2020 (The Institute of Electrical and Electronics Engineers, Inc.)
Subjects |
Online Access | Get full text
Summary: Reinforcement learning is a technique for power control in wireless communications. However, most research has focused on the deep Q-network (DQN) scheme, which outputs a Q-value for each discrete action and therefore does not match the continuous nature of the power control problem. Hence, this paper provides a deep deterministic policy gradient (DDPG) scheme for power control. The power selection policy, designated the actor, is approximated by a convolutional neural network (CNN), and the evaluation of that policy, designated the critic, is approximated by a fully connected network. These deep neural networks enable fast decision-making for large-scale power control problems. Moreover, to speed up the training process, this paper proposes a prioritized sampling technique, which replays the experiences that still need to be learned with higher probability. This paper simulates the proposed algorithm in a multiple sweep interference (MSI) scenario. The simulation results show that the DDPG scheme is more likely to achieve the optimal policy than the DQN scheme. In addition, the DDPG scheme with prioritized sampling (DDPG-PS) converges faster than the DDPG scheme with uniform sampling.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3033333
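
The summary above names two concrete components: a DDPG actor-critic pair in which a CNN actor outputs continuous transmit powers and a fully connected critic scores state-action pairs, and a replay buffer that samples experiences in proportion to how much they still need to be learned. The PyTorch sketch below illustrates one plausible shape for these pieces; the class names, the state layout (a single-channel gain map), the number of links, layer sizes, and hyper-parameters are illustrative assumptions rather than the paper's actual configuration.

```python
# Minimal sketch of the two ideas in the abstract: a DDPG actor-critic pair
# (CNN actor, fully connected critic) and a replay buffer that samples
# transitions with probability proportional to their TD-error priority.
# Shapes and hyper-parameters are assumptions, not the paper's settings.
import numpy as np
import torch
import torch.nn as nn


class Actor(nn.Module):
    """CNN policy: maps a (channels x grid x grid) state, e.g. a channel-gain
    map, to continuous transmit powers for n_links links in [0, p_max]."""

    def __init__(self, in_channels=1, grid=8, n_links=4, p_max=1.0):
        super().__init__()
        self.p_max = p_max
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(16 * grid * grid, 128), nn.ReLU(),
            nn.Linear(128, n_links), nn.Sigmoid(),
        )

    def forward(self, state):
        # Sigmoid keeps the output in [0, 1]; scale to the power budget.
        return self.p_max * self.head(self.conv(state))


class Critic(nn.Module):
    """Fully connected Q-network: scores a (flattened state, action) pair."""

    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state_flat, action):
        return self.net(torch.cat([state_flat, action], dim=-1))


class PrioritizedReplay:
    """Samples transitions with probability proportional to |TD error|^alpha,
    so experiences that still need to be learned are replayed more often."""

    def __init__(self, capacity=10000, alpha=0.6, eps=1e-3):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []

    def add(self, transition):
        # New experiences get the current maximum priority so that each one
        # is replayed at least once before its priority is refined.
        p = max(self.priorities, default=1.0)
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size):
        probs = np.array(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # Larger TD error -> higher priority on the next sampling pass.
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(float(e)) + self.eps
```

In a full DDPG loop, the critic would be trained on TD targets computed with slowly updated target copies of both networks, the actor would be updated by ascending the critic's gradient with respect to the action, and the priorities above would be refreshed with the new TD errors after each minibatch; that refresh is what biases sampling toward the transitions that are still poorly learned.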