Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle

Bibliographic Details
Published in	IEEE Transactions on Neural Networks and Learning Systems Vol. 32; no. 12; pp. 5456-5467
Main Authors	Wang, Ning; Gao, Ying; Zhang, Xuefeng
Format	Journal Article
Language	English
Published	United States: IEEE, 01.12.2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Constraints; Controllers; Data-driven control; Field-flow fractionation; Learning; Marine environment; Neural networks; Optimal control; Optimization; performance-prescribed control; Reinforcement; Reinforcement learning; reinforcement learning control; Steady-state; Surface vehicles; System dynamics; Theoretical analysis; Tracking control; Tracking errors; Transient analysis; unmanned surface vehicle (USV); Unmanned vehicles; Vehicle dynamics; Virtual reality
ISSN	2162-237X; 2162-2388
DOI	10.1109/TNNLS.2021.3056444

Summary:	An unmanned surface vehicle (USV) under complicated marine environments can hardly be modeled well such that model-based optimal control approaches become infeasible. In this article, a self-learning-based model-free solution only using input-output signals of the USV is innovatively provided. To this end, a data-driven performance-prescribed reinforcement learning control (DPRLC) scheme is created to pursue control optimality and prescribed tracking accuracy simultaneously. By devising state transformation with prescribed performance, constrained tracking errors are substantially converted into constraint-free stabilization of tracking errors with unknown dynamics. Reinforcement learning paradigm using neural network-based actor-critic learning framework is further deployed to directly optimize controller synthesis deduced from the Bellman error formulation such that transformed tracking errors evolve a data-driven optimal controller. Theoretical analysis eventually ensures that the entire DPRLC scheme can guarantee prescribed tracking accuracy, subject to optimal cost. Both simulations and virtual-reality experiments demonstrate the remarkable effectiveness and superiority of the proposed DPRLC scheme.