Deep Reinforcement Learning-Based Automatic Exploration for Navigation in Unknown Environment

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 31, no. 6, pp. 2064-2076
Main Authors: Li, Haoran; Zhang, Qichao; Zhao, Dongbin
Format: Journal Article
Language: English
Published: United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2020
Summary: This paper investigates the automatic exploration problem in unknown environments, which is key to applying robotic systems to social tasks. Solutions that stack hand-crafted decision rules cannot cover the variety of environments and sensor properties. Learning-based control methods adapt to such scenarios, but they suffer from low learning efficiency and poor transferability from simulation to reality. In this paper, we construct a general exploration framework by decomposing the exploration process into decision, planning, and mapping modules, which increases the modularity of the robotic system. Based on this framework, we propose a deep reinforcement learning-based decision algorithm that uses a deep neural network to learn the exploration strategy from the partial map. The results show that the proposed algorithm has better learning efficiency and adaptability to unknown environments. In addition, we conduct experiments on a physical robot, and the results suggest that the learned policy transfers well from simulation to the real robot.
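The framework described in the summary decomposes exploration into decision, planning, and mapping modules, with a learned policy selecting goals from the partial map. The Python sketch below is only an illustration of that decomposition, not the paper's implementation: the module interfaces (MappingModule, DecisionModule, PlanningModule), the toy sensor model, and the frontier-scoring heuristic standing in for the trained deep network are all assumptions made for this example.

```python
import numpy as np

UNKNOWN, FREE, OCCUPIED = -1, 0, 1

class MappingModule:
    """Maintains the partial occupancy grid built so far."""
    def __init__(self, shape):
        self.grid = np.full(shape, UNKNOWN, dtype=np.int8)

    def integrate(self, pose, true_map, radius=2):
        # Toy sensor model: reveal ground-truth cells around the robot.
        r, c = pose
        r0, r1 = max(r - radius, 0), min(r + radius + 1, true_map.shape[0])
        c0, c1 = max(c - radius, 0), min(c + radius + 1, true_map.shape[1])
        self.grid[r0:r1, c0:c1] = true_map[r0:r1, c0:c1]

class DecisionModule:
    """Stand-in for the learned decision policy: picks a goal from the
    partial map. In the paper this role is played by a deep network
    trained with RL; a frontier heuristic keeps the sketch runnable."""
    def decide(self, grid):
        best, best_score = None, 0
        for r, c in np.argwhere(grid == FREE):
            neigh = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            score = int((neigh == UNKNOWN).sum())  # crude information-gain proxy
            if score > best_score:
                best, best_score = (r, c), score
        return best  # None => no frontiers left, exploration finished

class PlanningModule:
    """Stand-in for a motion planner: one greedy step toward the goal."""
    def step_toward(self, pose, goal):
        (r, c), (gr, gc) = pose, goal
        return (r + int(np.sign(gr - r)), c + int(np.sign(gc - c)))

def explore(true_map, start=(0, 0), max_steps=500):
    mapper = MappingModule(true_map.shape)
    decider, planner = DecisionModule(), PlanningModule()
    pose = start
    mapper.integrate(pose, true_map)
    for _ in range(max_steps):
        goal = decider.decide(mapper.grid)      # decision: where to explore next
        if goal is None:
            break
        pose = planner.step_toward(pose, goal)  # planning: move toward the goal
        mapper.integrate(pose, true_map)        # mapping: update the partial map
    return mapper.grid

if __name__ == "__main__":
    world = np.zeros((20, 20), dtype=np.int8)  # obstacle-free toy world
    explored = explore(world)
    print(f"explored {np.mean(explored != UNKNOWN):.0%} of the map")
```

Keeping the learned component behind the narrow DecisionModule.decide interface is what would allow the planner and mapper to be swapped between simulation and a physical robot, which is the modularity benefit the summary emphasizes.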
ISSN: 2162-237X
EISSN: 2162-2388
DOI: 10.1109/TNNLS.2019.2927869