Fuzzy-Based Adaptive Optimization of Unknown Discrete-Time Nonlinear Markov Jump Systems With Off-Policy Reinforcement Learning
| Published in | IEEE Transactions on Fuzzy Systems, Vol. 30, No. 12, pp. 5276–5290 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2022 |
| Subjects | |
| Summary | This article explores a novel adaptive optimal control strategy for a class of discrete-time nonlinear Markov jump systems (DTNMJSs) via Takagi–Sugeno fuzzy models and reinforcement learning (RL) techniques. First, the original nonlinear system is represented by fuzzy approximation, so that the associated optimal control problem becomes equivalent to designing fuzzy controllers for linear fuzzy systems with Markovian jumping parameters. Subsequently, the fuzzy coupled algebraic Riccati equations for the fuzzy-based discrete-time linear Markov jump systems are derived using Hamiltonian–Bellman methods. An online fuzzy optimization algorithm for DTNMJSs is then given, together with a proof of its equivalence. Next, a fully model-free off-policy fuzzy RL algorithm with a convergence proof is derived for DTNMJSs, requiring no knowledge of the system dynamics or transition probabilities. Finally, two simulation examples, a single-link robotic arm and a half-car active suspension, verify the effectiveness and performance of the proposed approach. |
|---|---|
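The summary refers to coupled algebraic Riccati equations for discrete-time linear Markov jump systems, which the article's off-policy RL algorithm solves without knowing the system matrices or transition probabilities. The sketch below illustrates only the model-based baseline: a fixed-point (value) iteration on the standard coupled Riccati equations, where the mode coupling enters through the expectation E_i(P) = Σ_j Pi[i, j] P_j. All matrices and the transition matrix here are made-up illustrative data, not taken from the paper.

```python
import numpy as np

def solve_coupled_dare(A, B, Q, R, Pi, iters=2000, tol=1e-12):
    """Fixed-point iteration on the coupled Riccati equations
       P_i = Q_i + A_i' E_i(P) A_i
             - A_i' E_i(P) B_i (R_i + B_i' E_i(P) B_i)^{-1} B_i' E_i(P) A_i,
       with mode-coupling expectation E_i(P) = sum_j Pi[i, j] P_j.
       Returns mode-dependent cost matrices P and feedback gains K."""
    m, n = len(A), A[0].shape[0]
    P = [np.zeros((n, n)) for _ in range(m)]
    for _ in range(iters):
        # Expected cost-to-go matrix for each mode, weighted by jump probabilities.
        E = [sum(Pi[i, j] * P[j] for j in range(m)) for i in range(m)]
        # Mode-dependent LQR-type gains.
        K = [np.linalg.solve(R[i] + B[i].T @ E[i] @ B[i],
                             B[i].T @ E[i] @ A[i]) for i in range(m)]
        P_new = [Q[i] + A[i].T @ E[i] @ (A[i] - B[i] @ K[i]) for i in range(m)]
        if max(np.linalg.norm(Pn - Po) for Pn, Po in zip(P_new, P)) < tol:
            P = P_new
            break
        P = P_new
    return P, K

# Hypothetical two-mode example: mode-dependent dynamics, shared input channel.
A = [np.array([[0.9, 0.2], [0.0, 0.8]]), np.array([[0.7, 0.1], [0.1, 0.9]])]
B = [np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]])]
Q = [np.eye(2), np.eye(2)]
R = [np.array([[1.0]]), np.array([[1.0]])]
Pi = np.array([[0.8, 0.2], [0.3, 0.7]])   # mode transition probabilities

P, K = solve_coupled_dare(A, B, Q, R, Pi)
```

The article's contribution is precisely that this iteration can be replicated from measured trajectory data alone (off-policy RL), so that neither A, B nor Pi is needed; the model-based loop above is only a reference point for what the data-driven scheme converges to.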
| ISSN | 1063-6706, 1941-0034 |
| DOI | 10.1109/TFUZZ.2022.3171844 |