A Cooperative Online Learning-Based Load Balancing Scheme for Maximizing QoS Satisfaction in Dense HetNets

Bibliographic Details
Published in: IEEE Access, Vol. 9, pp. 92345-92357
Main Authors: Choi, Hyungwoo; Kim, Taehwa; Park, Hong-Shik; Choi, Jun Kyun
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: This paper proposes a cooperative multi-agent online reinforcement learning-based (COMORL) bias offset (BO) control scheme for cell range expansion (CRE) in dense heterogeneous networks (HetNets). The proposed COMORL scheme controls BOs for CRE to maximize the number of user equipments (UEs) that satisfy their quality of service (QoS) requirements, particularly in terms of delay and data rate. For this purpose, we developed a QoS satisfaction indicator that measures violations of delay requirements by considering both the QoS requirements and the signal-to-interference-plus-noise ratio (SINR). In addition, we formulated a Markov decision process (MDP) model that is solved with a cooperative multi-agent online reinforcement learning algorithm. The proposed COMORL scheme maximizes the global utility for load-coupled base stations (BSs). Our simulation results verify the effectiveness of the proposed COMORL scheme in terms of throughput, delay satisfaction ratio, and fairness. Here, the delay satisfaction ratio is the fraction of UEs in a serving BS that satisfy their delay requirement. Specifically, in a dynamic scenario, the proposed COMORL scheme improves the delay satisfaction ratio over the max-SINR scheme by up to approximately 27% under medium traffic load and 30% under full traffic load.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3089782
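
The summary above refers to two mechanisms that a small example may make more concrete: bias-offset-based cell association (the CRE step whose offsets the COMORL scheme controls) and the delay satisfaction ratio used as an evaluation metric. The following is a minimal sketch only, not the authors' implementation. It assumes the common CRE convention that each UE attaches to the BS with the largest received power plus bias offset, and it compares a given per-UE delay against a per-UE requirement. The paper's QoS satisfaction indicator also accounts for SINR, which this sketch does not model. The function names `associate` and `delay_satisfaction_ratio` and all numeric values are hypothetical.

```python
from typing import List, Optional


def associate(rx_power_dbm: List[List[float]], bias_db: List[float]) -> List[int]:
    """For each UE, return the index of the BS with the largest biased power.

    rx_power_dbm[u][b] is the power UE u receives from BS b (dBm);
    bias_db[b] is the bias offset of BS b (dB). Assumed association rule,
    not taken from the paper.
    """
    serving = []
    for ue_power in rx_power_dbm:
        biased = [p + b for p, b in zip(ue_power, bias_db)]
        serving.append(max(range(len(biased)), key=biased.__getitem__))
    return serving


def delay_satisfaction_ratio(
    delays_ms: List[float],
    requirements_ms: List[float],
    serving: List[int],
    bs: int,
) -> Optional[float]:
    """Fraction of UEs served by `bs` whose delay meets its requirement.

    Returns None when `bs` serves no UE.
    """
    served = [u for u, s in enumerate(serving) if s == bs]
    if not served:
        return None
    satisfied = sum(1 for u in served if delays_ms[u] <= requirements_ms[u])
    return satisfied / len(served)


if __name__ == "__main__":
    # Hypothetical values: BS 0 is a macro cell, BS 1 is a small cell.
    rx_power = [[-80.0, -95.0], [-85.0, -88.0], [-90.0, -84.0]]
    no_bias = associate(rx_power, [0.0, 0.0])
    with_bias = associate(rx_power, [0.0, 6.0])
    print(no_bias, with_bias)

    delays = [20.0, 60.0, 30.0]
    requirements = [50.0, 50.0, 50.0]
    print(delay_satisfaction_ratio(delays, requirements, with_bias, bs=1))
```

With a zero bias the second UE stays on the macro cell; a 6 dB offset on the small cell moves it there, which is the load-shifting effect that BO control exploits. Choosing the offsets themselves is the part the paper assigns to the reinforcement learning agents and is not shown here.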