Online Learning for Position-Aided Millimeter Wave Beam Training

Bibliographic Details
Published in	IEEE Access, Vol. 7, pp. 30507-30526
Main Authors	Va, Vutha; Shimizu, Takayuki; Bansal, Gaurav; Heath, Robert W.
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2019
Subjects	Algorithms; Angular position; Applications programs; Array signal processing; beam alignment; beam refinement; Distance learning; Heating systems; Machine learning; Millimeter wave; Millimeter wave technology; Millimeter waves; Mobile computing; multi-armed bandit; Multi-armed bandit problems; online learning; Optimization; Position sensing; position-aided; risk-aware learning; Training; Transportation
Summary:	Accurate beam alignment is essential for beam-based millimeter wave communications. Conventional beam sweeping solutions often incur a large overhead, which is unacceptable for mobile applications such as vehicle-to-everything. Learning-based solutions that leverage sensor data (e.g., position) to identify good beam directions are one approach to reducing the overhead. Most existing solutions, though, rely on supervised learning, where the training data are collected beforehand. In this paper, we use a multi-armed bandit framework to develop online learning algorithms for beam pair selection and refinement. The beam pair selection algorithm learns coarse beam directions in a predefined beam codebook, e.g., discrete angles separated by the 3 dB beamwidths. The beam refinement algorithm fine-tunes the identified directions to match the peak of the power angular spectrum at that position. Beam pair selection uses the upper confidence bound with a newly proposed risk-aware feature, while beam refinement uses a modified optimistic optimization algorithm. The proposed algorithms quickly learn to recommend good beam pairs. With $16\times 16$ arrays at both the transmitter and the receiver, they achieve, on average, a 1 dB gain over exhaustive search (over $271\times 271$ beam pairs) on the unrefined codebook within 100 time steps, with a training budget of only 30 beam pairs.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2019.2902372
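
The summary describes beam pair selection as a multi-armed bandit problem solved with an upper confidence bound. As a rough illustration of that idea only, the sketch below runs plain UCB1 over a small toy codebook, measuring a fixed budget of beam pairs per time step. It does not include the paper's risk-aware feature or the refinement stage. The function names, the Gaussian reward model, the 20-entry codebook, and the budget of 3 are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def ucb1_select(counts, means, t, budget):
    """Return the `budget` arm indices with the highest UCB1 index at step t."""
    def index(arm):
        if counts[arm] == 0:
            return math.inf  # untried beam pairs are measured first
        # Empirical mean plus the standard UCB1 exploration bonus.
        return means[arm] + math.sqrt(2.0 * math.log(t) / counts[arm])

    ranked = sorted(range(len(counts)), key=index, reverse=True)
    return ranked[:budget]


def run_ucb1(true_means, steps, budget, seed=0):
    """Simulate UCB1 with `budget` measurements per time step.

    `true_means` is a hypothetical normalized beam-pair quality in [0, 1];
    a real system would measure received power instead.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, steps + 1):
        for arm in ucb1_select(counts, means, t, budget):
            # Assumed reward model: noisy observation clipped to [0, 1].
            reward = min(1.0, max(0.0, rng.gauss(true_means[arm], 0.05)))
            counts[arm] += 1
            # Incremental update of the empirical mean.
            means[arm] += (reward - means[arm]) / counts[arm]
    best = max(range(n_arms), key=lambda arm: means[arm])
    return best, counts


# Toy codebook: 20 beam pairs, one of which (index 7) is clearly better.
TRUE_MEANS = [0.2] * 20
TRUE_MEANS[7] = 0.9

best_arm, pull_counts = run_ucb1(TRUE_MEANS, steps=100, budget=3)
print("recommended beam pair:", best_arm)
print("measurements of that pair:", pull_counts[best_arm])
```

The budget is what keeps the overhead low: an exhaustive search over the codebook in the summary would measure 271 × 271 = 73,441 beam pairs, whereas the reported training budget is 30 beam pairs.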