The Path Planning of Mobile Robot by Neural Networks and Hierarchical Reinforcement Learning
Published in | Frontiers in Neurorobotics, Vol. 14, p. 63 |
---|---|
Main Authors | |
Format | Journal Article |
Language | English |
Published | Lausanne: Frontiers Research Foundation (Frontiers Media S.A.), 02.10.2020 |
Summary: | To address the inability of existing mobile robots to learn path planning autonomously, as well as the slow convergence of path planning and the poor smoothness of planned paths, neural networks are used to perceive the environment and extract features, fitting the mapping from environment states to the state-action function. Hierarchical Reinforcement Learning (HRL) then maps the current state to an action to meet the robot's motion requirements, and a path planning model for mobile robots based on neural networks and HRL is constructed. The proposed algorithm is compared with other path planning algorithms and evaluated to obtain the optimal learning algorithm system. Finally, the optimal system is tested in different environments and scenarios to determine the optimal learning conditions, thereby verifying the effectiveness of the proposed algorithm. The experimental results show that the Asynchronous Advantage Actor-Critic (A3C) algorithm based on reinforcement learning is more effective than the traditional Q-Learning algorithm, and adding the neural network further improves the robots' path planning ability significantly. Compared with the Double Deep Q-Network (DDQN) algorithm, the Deep Deterministic Policy Gradient (DDPG) algorithm based on neural networks and HRL shortens the path planning time by 22.48%, reduces the number of path steps by 8.69%, reduces the convergence time by 55.52%, and increases the cumulative rewards significantly. When the action set is 4, the number of grids is 3, and the state set is 40*40*8, introducing the force value reduces the convergence time by 91% compared with the traditional Q-Learning algorithm and increases the smoothness of the path by 79%. The algorithm generalizes well across different scenarios. These results have theoretical value and guiding significance for research on the precise positioning and path planning of mobile robots. |
---|---|
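For reference, the sketch below shows the kind of tabular Q-Learning baseline the abstract compares against: a grid world with an action set of 4 and a 40x40 position grid. The start and goal cells, the reward scheme (+100 at the goal, -1 per step), and the hyperparameters are assumptions for illustration only, not settings taken from the paper.

```python
import numpy as np

# A minimal tabular Q-learning baseline on a 40x40 grid with an action set
# of 4 (up/down/left/right), matching the state/action sizes quoted above.
# Start/goal cells, rewards, and hyperparameters are illustrative assumptions.

GRID = 40
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, GOAL = (0, 0), (GRID - 1, GRID - 1)    # assumed start and goal cells

def step(state, action):
    """Move one cell, staying inside the grid (assumed dynamics)."""
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (min(max(r + dr, 0), GRID - 1), min(max(c + dc, 0), GRID - 1))
    # Assumed reward shaping: +100 at the goal, -1 per step otherwise.
    return nxt, (100.0 if nxt == GOAL else -1.0), nxt == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    q = np.zeros((GRID, GRID, len(ACTIONS)))  # one Q-value per (state, action)
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        state = START
        for _ in range(4 * GRID * GRID):      # cap episode length
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                action = int(rng.integers(len(ACTIONS)))
            else:
                action = int(np.argmax(q[state[0], state[1]]))
            nxt, reward, done = step(state, action)
            # one-step temporal-difference update toward the greedy target
            td_target = reward + gamma * np.max(q[nxt[0], nxt[1]])
            q[state[0], state[1], action] += alpha * (td_target - q[state[0], state[1], action])
            state = nxt
            if done:
                break
    return q

q = q_learning()
print("greedy action at start:", int(np.argmax(q[START[0], START[1]])))
```

The DDPG variant favored in the abstract replaces this explicit Q-table with actor and critic networks, which lets it generalize across the larger 40*40*8 state set instead of enumerating every entry.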
Bibliography: | Edited by: Mu-Yen Chen, National Taichung University of Science and Technology, Taiwan. Reviewed by: Yinyan Zhang, Jinan University, China; Kuan-Yu Lin, Ling Tung University, Taiwan |
ISSN: | 1662-5218 |
DOI: | 10.3389/fnbot.2020.00063 |