Maximum Entropy Optimal Density Control of Discrete-Time Linear Systems and Schrödinger Bridges

Bibliographic Details
Published in	IEEE Transactions on Automatic Control, Vol. 69, no. 3, pp. 1-16
Main Authors	Ito, Kaito; Kashima, Kenji
Format	Journal Article
Language	English
Published	New York: IEEE, 01.03.2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Bridges; Control systems; Costs; Density; Discrete time systems; Entropy; Gaussian distribution; Linear systems; Maximum entropy; Optimal control; Regularization; Safety critical; Schrödinger bridge; stochastic control; Stochastic processes; Uncertainty
Summary:	We consider an entropy-regularized version of optimal density control of deterministic discrete-time linear systems. Entropy regularization, also known as the maximum entropy (MaxEnt) method for optimal control, has attracted much attention, especially in reinforcement learning, because of advantages such as a natural exploration strategy. Despite these merits, the high-entropy control policies induced by the regularization introduce probabilistic uncertainty into the system, which severely limits the applicability of MaxEnt optimal control to safety-critical systems. To remedy this, we impose a Gaussian density constraint at a specified time on MaxEnt optimal control so that state uncertainty is controlled directly. Specifically, we derive the explicit form of the MaxEnt optimal density control. We also consider the case where the density constraints are replaced by fixed point constraints, and we characterize the associated state process as a pinned process, which is a generalization of the Brownian bridge to linear systems. Finally, we show that the MaxEnt optimal density control gives the so-called Schrödinger bridge associated with a discrete-time linear system.
ISSN:	0018-9286; 1558-2523
DOI:	10.1109/TAC.2023.3305319