The Path Planning of Mobile Robot by Neural Networks and Hierarchical Reinforcement Learning


Bibliographic Details
Published in: Frontiers in Neurorobotics, Vol. 14, p. 63
Main Authors: Yu, Jinglun; Su, Yuancheng; Liao, Yifan
Format: Journal Article
Language: English
Published: Lausanne: Frontiers Research Foundation / Frontiers Media S.A., 02.10.2020
Summary: To address the inability of existing mobile robots to learn autonomously during path planning, as well as the slow convergence of planning and the poor smoothness of planned paths, neural networks are used to perceive the environment and extract features, fitting the environment to a state-action function. By mapping the current state to an action through Hierarchical Reinforcement Learning (HRL), the mobility requirements of mobile robots are met, and a path planning model for mobile robots based on neural networks and HRL is constructed. The proposed algorithm is compared with other path planning algorithms and evaluated to obtain the optimal learning algorithm system; this system is then tested in different environments and scenarios to determine the optimal learning conditions, thereby verifying the effectiveness of the proposed algorithm. The experimental results show that the Asynchronous Advantage Actor-Critic (A3C) algorithm based on reinforcement learning is more effective than the traditional Q-Learning algorithm, and that introducing the neural network significantly improves the robots' path planning ability. Compared with the Double Deep Q-Network (DDQN) algorithm, under the actor-critic Deep Deterministic Policy Gradient (DDPG) algorithm based on neural networks and HRL, the path planning time is improved by 22.48%, the number of path steps by 8.69%, and the convergence time by 55.52%, while the cumulative reward increases significantly. When the action set is 4, the number of grids is 3, and the state set is 40*40*8, the introduction of the force value allows the algorithm to reduce the convergence time by 91% compared with the traditional Q-Learning algorithm and to increase path smoothness by 79%. The algorithm generalizes well across different scenarios.
The above results have important theoretical value and guiding significance for research on the precise positioning and path planning of mobile robots.
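To make the comparison concrete, below is a minimal sketch of the traditional tabular Q-Learning baseline that the abstract measures against, applied to grid-world path planning with a 4-action set. The grid size, reward values, and hyperparameters here are illustrative assumptions for the sketch, not the paper's experimental settings.

```python
import random

def q_learning_grid(size=6, episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=1):
    """Learn a path from (0, 0) to (size-1, size-1) on an empty grid."""
    rng = random.Random(seed)
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # 4-action set, as in the abstract
    goal = (size - 1, size - 1)
    Q = {}  # (state, action_index) -> estimated return

    def step(state, a):
        r, c = state[0] + actions[a][0], state[1] + actions[a][1]
        if not (0 <= r < size and 0 <= c < size):
            return state, -1.0                     # wall bump: stay put, penalty
        nxt = (r, c)
        return nxt, 10.0 if nxt == goal else -0.1  # step cost favors short paths

    for _ in range(episodes):
        state = (0, 0)
        while state != goal:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(actions))
            else:
                a = max(range(len(actions)), key=lambda i: Q.get((state, i), 0.0))
            nxt, reward = step(state, a)
            # Standard Q-Learning temporal-difference update.
            best_next = max(Q.get((nxt, i), 0.0) for i in range(len(actions)))
            old = Q.get((state, a), 0.0)
            Q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt

    # Greedy rollout to read off the learned path.
    path, state = [(0, 0)], (0, 0)
    for _ in range(4 * size * size):
        if state == goal:
            break
        a = max(range(len(actions)), key=lambda i: Q.get((state, i), 0.0))
        state, _ = step(state, a)
        path.append(state)
    return path
```

The slow convergence of this tabular baseline on larger state sets (e.g. the 40*40*8 case mentioned above) is what motivates the neural-network function approximation and HRL decomposition proposed in the paper.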
Reviewed by: Yinyan Zhang, Jinan University, China; Kuan-Yu Lin, Ling Tung University, Taiwan
Edited by: Mu-Yen Chen, National Taichung University of Science and Technology, Taiwan
ISSN: 1662-5218
DOI: 10.3389/fnbot.2020.00063