CBMoS: Combinatorial Bandit Learning for Mode Selection and Resource Allocation in D2D Systems
Published in | IEEE Journal on Selected Areas in Communications, Vol. 37, no. 10, pp. 2225-2238 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.10.2019; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
ISSN | 0733-8716 1558-0008 |
DOI | 10.1109/JSAC.2019.2933764 |
Summary: | The complexity of the mode selection and resource allocation (MS&RA) problem has hampered the commercialization of Device-to-Device (D2D) communication in 5G networks. Furthermore, the combinatorial nature of MS&RA has forced the majority of existing proposals to focus on constrained scenarios or offline solutions to contain the size of the problem. Given the real-time constraints in actual deployments, a reduction in computational complexity is necessary. Adaptability is another key requirement for mobile networks, which are exposed to constant changes such as channel quality fluctuations and mobility. In this article, we propose an online learning technique, CBMoS, that leverages combinatorial multi-armed bandits (CMAB) to tackle the combinatorial nature of MS&RA. Furthermore, our two-stage CMAB design results in a tight model, which eliminates the theoretically feasible but practically invalid options from the solution space. We prototype the first SDR-based D2D testbed to verify the performance of CBMoS under real-world conditions. Simulations confirm that the fast learning speed of CBMoS lets it outperform the benchmark schemes by up to 132%. In experiments, CBMoS exhibits even higher gains (up to 142%) than in the simulations. This stems from the adaptability and fast learning speed of CBMoS in the presence of high channel dynamics, which cannot be captured by the statistical channel models used in the simulators. |
---|---|