Dynamic Spectrum Interaction of UAV Flight Formation Communication With Priority: A Deep Reinforcement Learning Approach

Bibliographic Details
Published in: IEEE Transactions on Cognitive Communications and Networking, Vol. 6, no. 3, pp. 892-903
Main Authors: Lin, Yun; Wang, Meiyu; Zhou, Xianglong; Ding, Guoru; Mao, Shiwen
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2020
ISSN: 2332-7731
DOI: 10.1109/TCCN.2020.2973376

Summary: The formation flight of multiple unmanned aerial vehicles (UAVs) can improve the mission success probability compared with a single aircraft. Dynamic spectrum interaction solves the problem of ordered communication among multiple UAVs under limited bandwidth via spectrum interaction between the UAVs. By introducing a reinforcement learning algorithm, the UAVs can obtain the optimal strategy through continuous interaction with the environment. In this paper, two types of UAV formation communication methods are studied. One method allows information sharing between two UAVs in the same time slot. The other adopts a dynamic time slot allocation scheme, in which the UAVs use time slots alternately to realize information sharing. Quality of experience (QoE) is introduced to evaluate the results of UAV sharing, and an M/G/1 queuing model with priority is used to evaluate UAV packet loss. On the algorithmic side, a combination of deep reinforcement learning (DRL) and a long short-term memory (LSTM) network is adopted to accelerate convergence. The experimental results show that, compared with the Q-learning and deep Q-network (DQN) methods, the proposed method achieves faster convergence and better throughput.
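
The summary names the paper's core mechanism: a deep Q-network whose state encoder is an LSTM over the recent spectrum-occupancy history. The exact architecture is not given in this record, so the following is only a minimal PyTorch sketch of that general DQN-plus-LSTM idea; the class name, layer sizes, and observation encoding are illustrative assumptions rather than the authors' design.

    import random

    import torch
    import torch.nn as nn

    # Hypothetical sketch: an LSTM summarizes the last T spectrum observations
    # (one binary occupancy flag per channel), and a linear head maps the final
    # hidden state to one Q-value per candidate channel. Sizes are assumptions.
    class LstmDqn(nn.Module):
        def __init__(self, n_channels: int, hidden_size: int = 64):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden_size,
                                batch_first=True)
            self.head = nn.Linear(hidden_size, n_channels)

        def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
            # obs_seq: (batch, seq_len, n_channels) occupancy history
            out, _ = self.lstm(obs_seq)
            return self.head(out[:, -1, :])  # Q-value for each channel

    def select_channel(net: LstmDqn, obs_seq: torch.Tensor, eps: float) -> int:
        # Epsilon-greedy action selection, as in standard DQN training.
        if random.random() < eps:
            return random.randrange(obs_seq.shape[-1])
        with torch.no_grad():
            return int(net(obs_seq).argmax(dim=-1).item())

For example, with net = LstmDqn(n_channels=8) and a history tensor of shape (1, 10, 8), select_channel(net, history, eps=0.1) returns the index of the channel to transmit on; training would then follow a standard DQN loop with a replay buffer and target network, consistent with the abstract's comparison against plain Q-learning and DQN baselines.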