Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications

Bibliographic Details
Published in	IEEE Global Communications Conference (Online) pp. 1 - 6
Main Authors	Lin, Jiaye; Zou, Yuze; Dong, Xiaoru; Gong, Shimin; Hoang, Dinh Thai; Niyato, Dusit
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2020
Subjects	Approximation algorithms; Array signal processing; Channel estimation; Receivers; Reinforcement learning; Scattering; Wireless communication
Summary:	An intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver. In this paper, we minimize the AP's transmit power by jointly optimizing the AP's active beamforming and the IRS's passive beamforming. Because channel conditions are uncertain, we formulate a robust power minimization problem subject to the receiver's signal-to-noise ratio (SNR) requirement and the IRS's power budget constraint. We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies based on past experiences. To improve the learning performance, we derive a convex approximation as a lower bound on the robust problem and integrate it into the DRL framework, yielding a novel optimization-driven deep deterministic policy gradient (DDPG) approach. In particular, when the DDPG algorithm generates one part of the action (e.g., passive beamforming), we can use the model-based convex approximation to optimize the other part of the action (e.g., active beamforming) efficiently. Our simulation results demonstrate that the optimization-driven DDPG algorithm can significantly improve both the learning rate and the reward compared to the conventional DDPG algorithm.
ISSN:	2576-6813
DOI:	10.1109/GLOBECOM42002.2020.9322372
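
Illustrative sketch (not from the paper): the split described in the summary, where a learner proposes the passive beamforming and a model-based step solves for the active beamforming, can be mimicked in a toy setting. Everything below is an assumption made for illustration: channels are perfectly known (no robustness), the IRS applies unit-magnitude phase shifts with no power budget, a plain random search stands in for the DDPG actor, and a closed-form minimum-power beamformer stands in for the convex approximation. The function names and channel convention (received signal = sum of h_eff[m] * w[m]) are hypothetical.

```python
import cmath
import math
import random


def effective_channel(h_d, G, h_r, theta):
    """Combine the direct path and the IRS-reflected paths into one
    AP-to-receiver channel (assumed unit-magnitude IRS phase shifts)."""
    n_irs = len(theta)
    return [
        h_d[m] + sum(h_r[n] * cmath.exp(1j * theta[n]) * G[n][m] for n in range(n_irs))
        for m in range(len(h_d))
    ]


def min_power_beamformer(h_eff, snr_target, noise_power):
    """Least-power active beamformer that meets the SNR target exactly
    for a single receiver with a known channel (closed form)."""
    gain = sum(abs(h) ** 2 for h in h_eff)
    scale = math.sqrt(snr_target * noise_power) / gain
    w = [scale * h.conjugate() for h in h_eff]
    power = sum(abs(x) ** 2 for x in w)  # equals snr_target * noise_power / gain
    return w, power


def search_phases(h_d, G, h_r, snr_target, noise_power, trials, rng):
    """Random search over IRS phases (a stand-in for the learned policy).
    Each candidate is scored by the inner closed-form power."""
    best_theta, best_power = None, math.inf
    for _ in range(trials):
        theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(len(h_r))]
        h_eff = effective_channel(h_d, G, h_r, theta)
        _, power = min_power_beamformer(h_eff, snr_target, noise_power)
        if power < best_power:
            best_theta, best_power = theta, power
    return best_theta, best_power


def random_channels(n_ant, n_irs, rng):
    """Draw toy complex Gaussian channels (assumed, not the paper's model)."""
    draw = lambda: complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
    h_d = [draw() for _ in range(n_ant)]
    G = [[draw() for _ in range(n_ant)] for _ in range(n_irs)]
    h_r = [draw() for _ in range(n_irs)]
    return h_d, G, h_r


if __name__ == "__main__":
    h_d, G, h_r = random_channels(4, 8, random.Random(0))
    _, power = search_phases(h_d, G, h_r, 10.0, 1.0, 200, random.Random(1))
    print(f"best transmit power found: {power:.4f}")
```

The point of the split is that the search only explores the IRS phases; for each candidate, the AP's beamformer is computed rather than learned, which shrinks the space the learner has to cover.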