Confrontation and Obstacle-Avoidance of Unmanned Vehicles Based on Progressive Reinforcement Learning

Bibliographic Details
Published in: IEEE Access, Vol. 11, pp. 50398-50411
Main Authors: Ma, Chengdong; Liu, Jianan; He, Saichao; Hong, Wenjing; Shi, Jia
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023

Summary: The core technique of unmanned vehicle systems is autonomous maneuvering decision-making, which not only determines the applications of unmanned vehicles but is also a critical technique that many countries are competing to develop. Reinforcement Learning (RL) is a promising design method for autonomous maneuvering decision-making systems. Nevertheless, in the face of complex decision-making tasks, it remains challenging to master the optimal policy, because learning efficiency is low in complex environments with high-dimensional states and sparse rewards. Inspired by the human learning process from simple to complex, this paper proposes a novel progressive deep RL algorithm for policy optimization in unmanned autonomous decision-making systems. The proposed algorithm divides the training of the autonomous maneuvering decision into a sequence of curricula whose learning tasks progress from simple to complex, and a final self-play stage realizes the iterative optimization of the policy. Furthermore, a confrontation environment with two unmanned vehicles and obstacles is analyzed and modeled. Simulation results on one-to-one adversarial tasks demonstrate the effectiveness and applicability of the proposed algorithm.
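
The summary describes a curriculum of training tasks ordered from simple to complex, followed by a self-play stage. The sketch below is purely illustrative and not taken from the paper: it assumes a toy 1-D pursuit environment (PursuitEnv), tabular Q-learning, and made-up hyperparameters, only to show the general shape of progressive training, where one policy is carried through increasingly hard tasks and then refined against frozen snapshots of itself.

```python
# Illustrative sketch only -- NOT the paper's algorithm or code.
# Toy curriculum RL: train one Q-table over tasks of increasing size,
# then continue training in self-play against frozen policy snapshots.
import random
from collections import defaultdict

class PursuitEnv:
    """Toy 1-D pursuit task: the agent must reach (or catch) the target."""
    def __init__(self, size, opponent_policy=None):
        self.size = size                        # curriculum knob: larger = harder
        self.opponent_policy = opponent_policy  # set during the self-play stage
    def reset(self):
        self.agent, self.target = 0, self.size - 1
        return (self.agent, self.target)
    def step(self, action):                     # action in {-1, 0, +1}
        self.agent = max(0, min(self.size - 1, self.agent + action))
        if self.opponent_policy is not None:    # self-play: the target moves too
            move = self.opponent_policy((self.target, self.agent))
            self.target = max(0, min(self.size - 1, self.target + move))
        done = self.agent == self.target
        return (self.agent, self.target), (1.0 if done else -0.01), done

def train(env, q, episodes=2000, eps=0.2, alpha=0.5, gamma=0.95):
    """Tabular Q-learning; the same Q-table is carried across curriculum stages."""
    actions = (-1, 0, 1)
    for _ in range(episodes):
        s, done, steps = env.reset(), False, 0
        while not done and steps < 4 * env.size:
            a = random.choice(actions) if random.random() < eps else \
                max(actions, key=lambda b: q[(s, b)])
            s2, r, done = env.step(a)
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions) - q[(s, a)])
            s, steps = s2, steps + 1
    return q

def greedy(q_frozen):
    """Greedy policy over a frozen snapshot of the Q-table."""
    return lambda s: max((-1, 0, 1), key=lambda a: q_frozen.get((s, a), 0.0))

q = defaultdict(float)
for size in (4, 8, 16):                         # curriculum: tasks from simple to complex
    q = train(PursuitEnv(size), q)
for _ in range(3):                              # self-play: opponent is a frozen snapshot
    q = train(PursuitEnv(16, opponent_policy=greedy(dict(q))), q)
print("state-action values learned:", len(q))
```

The staging mirrors the idea in the summary only in outline: easier tasks shape the policy before harder ones, and the self-play loop reuses the current policy as the opponent, so each round optimizes against the previous iteration.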
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3278597