Autonomous Navigation of UAVs in Large-Scale Complex Environments: A Deep Reinforcement Learning Approach

Bibliographic Details
Published in	IEEE Transactions on Vehicular Technology, Vol. 68, no. 3, pp. 2124-2136
Main Authors	Wang, Chao; Wang, Jian; Shen, Yuan; Zhang, Xudong
Format	Journal Article
Language	English
Published	New York: IEEE, 01.03.2019; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Aerospace electronics; Algorithms; Autonomous navigation; deep reinforcement learning; Markov analysis; Markov processes; Navigation; partially observable Markov decision process; Path planning; Reinforcement learning; Sensors; Simultaneous localization and mapping; Task complexity; Unmanned aerial vehicles; Virtual environments

More Information
Summary:	In this paper, we propose a deep reinforcement learning (DRL)-based method that allows unmanned aerial vehicles (UAVs) to execute navigation tasks in large-scale complex environments. This technique is important for many applications, such as goods delivery and remote surveillance. The problem is formulated as a partially observable Markov decision process (POMDP) and solved by a novel online DRL algorithm built on two rigorously proved policy gradient theorems within the actor-critic framework. In contrast to conventional approaches based on simultaneous localization and mapping or on sensing and avoidance, our method maps UAVs' raw sensory measurements directly into control signals for navigation. Experimental results demonstrate that our method enables UAVs to navigate autonomously in a virtual large-scale complex environment and can be generalized to more complex, larger-scale, and three-dimensional environments. In addition, the proposed online DRL algorithm for POMDPs outperforms the state of the art.
ISSN:	0018-9545; 1939-9359
DOI:	10.1109/TVT.2018.2890773