Data-Driven Design of a Reference Governor Using Deep Reinforcement Learning


Bibliographic Details
Published in: Control Technology and Applications (Online), pp. 956 - 961
Main Authors: Taylor, Maria Angelica; Giraldo, Luis Felipe
Format: Conference Proceeding
Language: English
Published: IEEE, 09.08.2021

Summary: Reference tracking systems involve a plant that is stabilized by a local feedback controller and a command center that indicates the reference set-point the plant should follow. Typically, these systems are subject to limitations such as disturbances, system delays, constraints, uncertainties, underperforming controllers, and unmodeled parameters that do not allow them to achieve the desired performance. In situations where it is not possible to redesign the closed-loop system, it is usual to incorporate a reference governor that instructs the system to follow a modified reference path such that the resultant path is close to the ideal one. Typically, strategies to design the reference governor need to know a model of the system, which can be an unfeasible task. In this paper, we propose a framework based on deep reinforcement learning that can learn a policy to generate a modified reference that improves the system's performance in a non-invasive and model-free fashion. To illustrate the effectiveness of our approach, we present two challenging cases in engineering: flight control with a pilot model that includes human reaction delays, and a mean-field control problem for a massive number of space-heating devices. The proposed strategy successfully designs a reference signal that works even in situations that were not seen during the learning process.
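To make the reference-governor idea concrete, the following is a minimal illustrative sketch (not the paper's method): a stable first-order closed loop tracks a modified reference produced by a governor. Here the governor "policy" is a hand-coded rate limiter standing in for the learned deep-RL policy described in the abstract; the plant model, gains, and rate limit are all assumptions chosen for the example, while in the paper's framework the policy would be trained from observed signals in a model-free fashion.

```python
# Illustrative reference-governor loop (a sketch, NOT the paper's algorithm).
# The closed loop x+ = a*x + b*r_mod is already stabilized by a local
# controller; the governor only reshapes the reference it receives.

def plant_step(x, r_mod, a=0.8, b=0.2):
    """One step of the stabilized closed loop (assumed first-order model)."""
    return a * x + b * r_mod

def governor_policy(r_ref, r_prev, max_rate=0.1):
    """Stand-in governor: rate-limit the reference change per step.
    In the paper this map would be a learned deep-RL policy instead."""
    delta = max(-max_rate, min(max_rate, r_ref - r_prev))
    return r_prev + delta

def run(r_ref=1.0, steps=100):
    x, r_mod = 0.0, 0.0
    history = []
    for _ in range(steps):
        r_mod = governor_policy(r_ref, r_mod)  # governor modifies the reference
        x = plant_step(x, r_mod)               # closed loop follows r_mod
        history.append(x)
    return history

out = run()
```

Because the governor sits between the command center and the closed loop, it is non-invasive: the local feedback controller (folded into `plant_step` here) is never redesigned, which mirrors the setting the abstract describes.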
ISSN: 2768-0770
DOI: 10.1109/CCTA48906.2021.9658973