Deep reinforcement learning with shallow controllers: An experimental application to PID tuning
Published in | Control engineering practice Vol. 121; p. 105046 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.04.2022 |
Summary: | Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy. In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a “safe” region of the parameter space; and the final product, a well-tuned PID controller, has a form that practitioners can reason about and deploy with confidence.
• Reinforcement learning (RL) is used to tune a real-world PID controller.
• The RL policy is a PID controller, for compatibility with many current systems.
• Good tuning is achieved in roughly 40 min of training time.
• Full implementation details and thorough lab results are presented.
• A multi-criterion scorecard compares RL with several known auto-tuning methods. |
---|---|
ISSN: | 0967-0661 1873-6939 |
DOI: | 10.1016/j.conengprac.2021.105046 |
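
The summary describes using a PID controller as the trainable RL policy, initialized in a "safe" region of the parameter space and trained subject to input constraints. The sketch below illustrates only the shape of that idea and is not the authors' implementation: the three gains are the parameters a learning algorithm would adjust, and the output is clipped to actuator limits. The class `PIDPolicy`, the function `rollout`, the first-order plant, the gain values, and the limits are all hypothetical choices made for illustration. The RL update itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class PIDPolicy:
    """PID control law whose three gains act as the trainable policy parameters."""
    kp: float
    ki: float
    kd: float
    u_min: float = -1.0   # assumed actuator limits (hypothetical values)
    u_max: float = 1.0
    integral: float = 0.0
    prev_error: float = 0.0

    def act(self, error: float, dt: float) -> float:
        """Map the tracking error to a saturated control input."""
        candidate = self.integral + error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        u_raw = self.kp * error + self.ki * candidate + self.kd * derivative
        u = min(max(u_raw, self.u_min), self.u_max)
        # Simple anti-windup: keep the new integral only when unsaturated.
        if u == u_raw:
            self.integral = candidate
        return u


def rollout(policy, setpoint=0.5, steps=400, dt=0.05, tau=1.0):
    """Run one episode on a hypothetical plant dy/dt = (u - y) / tau.

    Returns the final output and the accumulated squared tracking error,
    which a learning algorithm could use as a (negative) reward signal.
    """
    y, cost = 0.0, 0.0
    for _ in range(steps):
        error = setpoint - y
        u = policy.act(error, dt)
        y += dt * (u - y) / tau   # forward-Euler step of the plant
        cost += error ** 2 * dt
    return y, cost


# Conservative starting gains standing in for a "safe" initialization.
policy = PIDPolicy(kp=0.5, ki=1.0, kd=0.0)
final_y, cost = rollout(policy)
```

Because the policy is just three gains and a clipping step, the result of training remains an ordinary PID controller that could be transferred to existing control hardware, which is the interpretability argument the summary makes.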