Multi-Agent Reinforcement Learning-Based Pilot Assignment for Cell-Free Massive MIMO Systems

Bibliographic Details
Published in	IEEE Access, Vol. 10, pp. 120492-120502
Main Authors	Rahmani, Mostafa; Dehghani, Mohammad Javad; Xiao, Pei; Bashar, Manijeh; Debbah, Merouane
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
Subjects	Algorithms; Antennas; Cell-free massive MIMO; Contamination; Data communication; Deep reinforcement learning; Fading channels; Interference; Machine learning; MIMO communication; Multiagent systems; Pilot assignment; Pilot contamination; Reinforcement learning; Spectral efficiency; Uplink; Wireless communication
Summary:	Cell-free massive multiple-input multiple-output (CF-mMIMO) is considered one of the potential technologies for beyond-5G and 6G to meet the demand for higher data capacity and a uniform service rate for user equipment. However, the reuse of the same pilot signals by several users, owing to limited pilot resources, can result in the so-called pilot contamination problem, which can prevent CF-mMIMO from unlocking its full performance potential. It is challenging to employ classical pilot assignment (PA) methods to serve many users simultaneously with low complexity; therefore, a scalable and distributed PA scheme is required. In this paper, we utilize a learning-based approach to handle the pilot contamination problem by formulating PA as a multi-agent static game, developing a two-level hierarchical learning algorithm to mitigate the effects of pilot contamination, and presenting an efficient yet scalable PA strategy. We first model the PA problem as a static multi-agent game with P teams (agents), in which each team is represented by a specific pilot. We then define a multi-agent structure that can automatically determine the most appropriate PA policy in a distributed manner. The numerical results demonstrate that the proposed PA algorithm outperforms previous suboptimal algorithms in terms of the per-user spectral efficiency (SE). In particular, the proposed approach can increase the average SE and the 95%-likely SE by approximately 2.2% and 3.3%, respectively, compared to the best state-of-the-art solution.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3221935