Decoupled Data-Based Approach for Learning to Control Nonlinear Dynamical Systems

Bibliographic Details
Published in	IEEE Transactions on Automatic Control, Vol. 67, no. 7, pp. 3582-3589
Main Authors	Wang, Ran; Parunandi, Karthikeya S.; Yu, Dan; Kalathil, Dileep; Chakravorty, Suman
Format	Journal Article
Language	English
Published	New York: IEEE, 01.07.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Approximation algorithms; Computational modeling; Data models; Dynamic programming; Feedback control; Heuristic algorithms; Learning; Linear quadratic regulator; Linearization; Nonlinear control; Nonlinear systems; Optimal control; Reinforcement learning; Stochastic control; Stochastic processes; Trajectory; Trajectory optimization
Summary:	This article addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system. This problem is subject to the "curse of dimensionality" associated with the dynamic programming method. The article proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled "open-loop-closed-loop" approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, a closed-loop controller is developed around this open-loop trajectory by linearizing the dynamics about this nominal trajectory. By virtue of linearization, a linear quadratic regulator (LQR) based algorithm can be used for the closed-loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state-of-the-art algorithms.
ISSN:	0018-9286; 1558-2523
DOI:	10.1109/TAC.2021.3108552
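
Illustrative sketch (not part of the catalog record): the two-stage structure described in the Summary can be outlined in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The pendulum simulator `step`, the cost weights, the horizon, and all function names are hypothetical. Finite-difference gradients with a backtracking step size stand in for whatever optimizer the article uses, and finite-difference Jacobians stand in for its linearization procedure. Only the overall pattern follows the Summary: open-loop optimization against a black-box simulator, then linearization about the nominal trajectory and a time-varying LQR feedback law.

```python
import numpy as np

def step(x, u, dt=0.1):
    """Hypothetical black-box simulator: one Euler step of a damped pendulum."""
    theta, omega = x
    domega = -np.sin(theta) - 0.1 * omega + u[0]
    return np.array([theta + dt * omega, omega + dt * domega])

def rollout(x0, U):
    """Simulate the noise-free system under the control sequence U."""
    X = [np.asarray(x0, dtype=float)]
    for u in U:
        X.append(step(X[-1], u))
    return np.array(X)

def cost(x0, U, target, r=0.01, qf=100.0):
    """Control effort plus squared terminal distance to the target state."""
    X = rollout(x0, U)
    return r * np.sum(U ** 2) + qf * np.sum((X[-1] - target) ** 2)

def open_loop(x0, target, T=20, iters=40, eps=1e-5):
    """Stage 1: gradient descent on the controls using simulator queries only."""
    U = np.zeros((T, 1))
    for _ in range(iters):
        g = np.zeros_like(U)
        for t in range(T):  # central finite differences, one control at a time
            d = np.zeros_like(U)
            d[t, 0] = eps
            g[t, 0] = (cost(x0, U + d, target) - cost(x0, U - d, target)) / (2 * eps)
        c0, lr = cost(x0, U, target), 1.0
        while lr > 1e-10 and cost(x0, U - lr * g, target) >= c0:
            lr *= 0.5  # backtrack until the cost decreases
        if lr <= 1e-10:
            break
        U = U - lr * g
    return U

def linearize(x, u, eps=1e-5):
    """Finite-difference Jacobians A = d(step)/dx and B = d(step)/du at (x, u)."""
    n, m = len(x), len(u)
    A, B = np.zeros((n, n)), np.zeros((n, m))
    for i in range(n):
        d = np.zeros(n)
        d[i] = eps
        A[:, i] = (step(x + d, u) - step(x - d, u)) / (2 * eps)
    for j in range(m):
        d = np.zeros(m)
        d[j] = eps
        B[:, j] = (step(x, u + d) - step(x, u - d)) / (2 * eps)
    return A, B

def tv_lqr(X, U, Q, R, Qf):
    """Stage 2: backward Riccati recursion along the nominal trajectory."""
    P, K = Qf, [None] * len(U)
    for t in reversed(range(len(U))):
        A, B = linearize(X[t], U[t])
        K[t] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K[t])
    return K

def closed_loop(x0, X, U, K, noise, rng):
    """Apply u_t = u_nom_t - K_t (x_t - x_nom_t) under additive Gaussian noise."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(len(U)):
        u = U[t] - K[t] @ (x - X[t])
        x = step(x, u) + noise * rng.standard_normal(x.shape)
    return x

# Example run with assumed weights: swing from rest toward angle 1.0 rad.
x0, target = np.zeros(2), np.array([1.0, 0.0])
U_nom = open_loop(x0, target)
X_nom = rollout(x0, U_nom)
K = tv_lqr(X_nom, U_nom, Q=np.eye(2), R=0.01 * np.eye(1), Qf=100.0 * np.eye(2))
x_T = closed_loop(x0, X_nom, U_nom, K, noise=0.01, rng=np.random.default_rng(0))
```

The decoupling shows up in the structure: `open_loop` never sees noise or feedback gains, and `tv_lqr` only needs the nominal trajectory it is handed, so either stage could be swapped for a different method without touching the other.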