Integral reinforcement learning-based event-triggered optimal tracking control for modular robot manipulators via non-zero-sum game

Bibliographic Details
Published in	Measurement science & technology Vol. 35; no. 9; p. 96205
Main Authors	Dong, Bo; Ding, Zhendong; An, Tianjiao; Cui, Yiming; Zhu, Xinye
Format	Journal Article
Language	English
Published	01.09.2024

More Information
Summary:	Under an event-triggered mechanism, a non-zero-sum (NZS) game optimal tracking control method for modular robot manipulator (MRM) systems with input constraints is proposed, using adaptive dynamic programming (ADP) based on integral reinforcement learning (IRL). First, a dynamic model of the MRM system is developed based on joint torque feedback; it consists of an n-joint subsystem with interconnected dynamic coupling (IDC). Second, we design a robust compensation controller to handle the known model term and an optimal compensation controller to deal with the uncertainty term caused by the IDC and friction. In addition, a nonlinear disturbance observer is established to counteract the negative effects of the uncertain sensor output disturbance. Third, based on differential game theory, we transform the optimal tracking control problem of the MRM system into an n-player NZS game. The IRL-based ADP method is then adopted, which relaxes the requirement for knowledge of the partially unknown system dynamics; only a critic neural network is used to solve the coupled Hamilton–Jacobi equation and thereby obtain the optimal control policy. Using Lyapunov theory, the tracking error of the MRM system is shown to be uniformly ultimately bounded. Finally, the effectiveness and superiority of the proposed algorithm are verified through experiments.
ISSN:	0957-0233, 1361-6501
DOI:	10.1088/1361-6501/ad50f8
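
The summary mentions an event-triggered mechanism and input constraints without giving details. As a rough illustration only, the sketch below shows the general idea on a scalar toy system: the control input is held constant and recomputed only when the gap between the current state and the last sampled state exceeds a threshold. Everything in it is an assumption made for illustration and is not taken from the article: the plant `x' = a*x + b*u`, the fixed gain `k` (standing in for a learned policy), the `tanh` saturation used to respect the bound `u_max`, and the trigger rule with parameters `sigma` and `eps`.

```python
import math


def simulate(a=1.0, b=1.0, k=3.0, u_max=2.0, sigma=0.2, eps=1e-3,
             x0=1.0, dt=1e-3, steps=5000):
    """Event-triggered, input-constrained control of a toy scalar plant.

    All parameters are hypothetical; this is not the article's controller.
    Returns the final state and the number of control updates (events).
    """
    def policy(x_sampled):
        # Saturated feedback: |u| never exceeds u_max (assumed constraint form).
        return -u_max * math.tanh(k * x_sampled / u_max)

    x = x0
    x_hat = x0            # state value at the most recent triggering instant
    u = policy(x_hat)     # control is held constant between events
    events = 1            # the initial computation counts as one event

    for _ in range(steps):
        # Assumed trigger rule: update only when the sampling gap is large.
        if abs(x - x_hat) > sigma * abs(x) + eps:
            x_hat = x
            u = policy(x_hat)
            events += 1
        # Forward-Euler step of the open-loop-unstable plant x' = a*x + b*u.
        x += dt * (a * x + b * u)

    return x, events


if __name__ == "__main__":
    final_state, n_events = simulate()
    print(f"final state: {final_state:.5f}, control updates: {n_events}")
```

With these assumed values the state settles into a small neighbourhood of the origin rather than converging exactly, which is the kind of behaviour a uniformly-ultimately-bounded result describes, while the control is recomputed far less often than once per integration step.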