Dynamic Spectrum Interaction of UAV Flight Formation Communication With Priority: A Deep Reinforcement Learning Approach

Bibliographic Details
Published in: IEEE Transactions on Cognitive Communications and Networking, Vol. 6, no. 3, pp. 892-903
Main Authors: Lin, Yun; Wang, Meiyu; Zhou, Xianglong; Ding, Guoru; Mao, Shiwen
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2020
ISSN: 2332-7731
DOI: 10.1109/TCCN.2020.2973376

Summary: The formation flight of multiple unmanned aerial vehicles (UAVs) can improve the mission success probability compared with a single aircraft. Dynamic spectrum interaction solves the problem of ordered communication among multiple UAVs under limited bandwidth via spectrum interaction between the UAVs. By introducing a reinforcement learning algorithm, the UAVs can obtain the optimal strategy through continuous interaction with the environment. In this paper, two types of UAV formation communication methods are studied. One method allows information sharing between two UAVs in the same time slot. The other adopts a dynamic time slot allocation scheme, in which the UAVs use time slots alternately to realize information sharing. Quality of experience (QoE) is introduced to evaluate the results of UAV sharing, and an M/G/1 queuing model with priority is used to evaluate UAV packet loss. On the algorithmic side, a combination of deep reinforcement learning (DRL) and a long short-term memory (LSTM) network is adopted to accelerate convergence. The experimental results show that, compared with the Q-learning and deep Q-network (DQN) methods, the proposed method achieves faster convergence and better throughput.
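
The summary names the paper's core mechanism: a deep Q-network whose state encoder is an LSTM over the recent spectrum-occupancy history. The exact architecture is not given in this record, so the following is only a minimal PyTorch sketch of that general DQN-plus-LSTM idea; the class name, layer sizes, and observation encoding are illustrative assumptions rather than the authors' design.

    import random

    import torch
    import torch.nn as nn

    # Hypothetical sketch: an LSTM summarizes the last T spectrum observations
    # (one binary occupancy flag per channel), and a linear head maps the final
    # hidden state to one Q-value per candidate channel. Sizes are assumptions.
    class LstmDqn(nn.Module):
        def __init__(self, n_channels: int, hidden_size: int = 64):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden_size,
                                batch_first=True)
            self.head = nn.Linear(hidden_size, n_channels)

        def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
            # obs_seq: (batch, seq_len, n_channels) occupancy history
            out, _ = self.lstm(obs_seq)
            return self.head(out[:, -1, :])  # Q-value for each channel

    def select_channel(net: LstmDqn, obs_seq: torch.Tensor, eps: float) -> int:
        # Epsilon-greedy action selection, as in standard DQN training.
        if random.random() < eps:
            return random.randrange(obs_seq.shape[-1])
        with torch.no_grad():
            return int(net(obs_seq).argmax(dim=-1).item())

For example, with net = LstmDqn(n_channels=8) and a history tensor of shape (1, 10, 8), select_channel(net, history, eps=0.1) returns the index of the channel to transmit on; training would then follow a standard DQN loop with a replay buffer and target network, consistent with the abstract's comparison against plain Q-learning and DQN baselines.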