Multi-Agent Reinforcement Learning-Based Pilot Assignment for Cell-Free Massive MIMO Systems


Bibliographic Details
Published in IEEE Access, Vol. 10, pp. 120492-120502
Main Authors Rahmani, Mostafa; Dehghani, Mohammad Javad; Xiao, Pei; Bashar, Manijeh; Debbah, Merouane
Format Journal Article
Language English
Published Piscataway IEEE 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Summary: Cell-free massive multiple-input multiple-output (CF-mMIMO) is considered one of the potential technologies for beyond-5G and 6G networks, meeting the demand for higher data capacity and a uniform service rate for user equipment. However, because pilot resources are limited, several users must reuse the same pilot signals, which causes the so-called pilot contamination problem and can prevent CF-mMIMO from unlocking its full performance potential. Classical pilot assignment (PA) methods struggle to serve many users simultaneously with low complexity; therefore, a scalable and distributed PA scheme is required. In this paper, we take a learning-based approach to the pilot contamination problem: we formulate PA as a multi-agent static game, develop a two-level hierarchical learning algorithm to mitigate the effects of pilot contamination, and present an efficient yet scalable PA strategy. We first model the PA problem as a static multi-agent game with P teams (agents), in which each team is represented by a specific pilot. We then define a multi-agent structure that can automatically determine the most appropriate PA policy in a distributed manner. Numerical results demonstrate that the proposed PA algorithm outperforms previous suboptimal algorithms in terms of per-user spectral efficiency (SE). In particular, the proposed approach can increase the average SE and the 95%-likely SE by approximately 2.2% and 3.3%, respectively, compared to the best state-of-the-art solution.
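
The idea of treating each pilot as a team that users join can be illustrated with a small toy sketch. This is not the two-level hierarchical learning algorithm from the paper, whose details are not given in this record. It swaps in plain best-response dynamics, and the `coupling` matrix (a symmetric score for how strongly two users would contaminate each other if they shared a pilot), the function names, and the example values are all assumptions made for illustration.

```python
import random


def contamination(assignment, coupling):
    """Total coupling summed over all user pairs that share a pilot."""
    total = 0.0
    n = len(assignment)
    for i in range(n):
        for j in range(i + 1, n):
            if assignment[i] == assignment[j]:
                total += coupling[i][j]
    return total


def best_response_assignment(coupling, num_pilots, max_rounds=100, seed=0):
    """Toy pilot assignment via best-response dynamics (illustrative only).

    coupling[i][j] is a hypothetical symmetric contamination score between
    users i and j. Each user in turn moves to the pilot (team) on which it
    sees the least coupling, until no user wants to move.
    """
    rng = random.Random(seed)
    n = len(coupling)
    assignment = [rng.randrange(num_pilots) for _ in range(n)]
    for _ in range(max_rounds):
        changed = False
        for u in range(n):
            # Cost of each pilot for user u: coupling to users already on it.
            costs = [0.0] * num_pilots
            for v in range(n):
                if v != u:
                    costs[assignment[v]] += coupling[u][v]
            best = min(range(num_pilots), key=costs.__getitem__)
            if costs[best] < costs[assignment[u]]:
                assignment[u] = best
                changed = True
        if not changed:
            break
    return assignment


# Hypothetical example: users 0/1 and users 2/3 are strongly coupled pairs.
coupling = [
    [0.0, 10.0, 1.0, 1.0],
    [10.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 10.0],
    [1.0, 1.0, 10.0, 0.0],
]
pilots = best_response_assignment(coupling, num_pilots=2)
print(pilots, contamination(pilots, coupling))
```

Because the assumed coupling scores are symmetric, every move that lowers one user's cost also lowers the total, so the loop settles on an assignment in which no single user can improve by switching pilots. In the example, the strongly coupled users end up on different pilots.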
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3221935