Deep reinforcement learning with shallow controllers: An experimental application to PID tuning

Bibliographic Details
Published in: Control Engineering Practice, Vol. 121, p. 105046
Main Authors: Lawrence, Nathan P.; Forbes, Michael G.; Loewen, Philip D.; McClement, Daniel G.; Backström, Johan U.; Gopaluni, R. Bhushan
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.04.2022
Summary: Deep reinforcement learning (RL) is an optimization-driven framework for producing control strategies for general dynamical systems without explicit reliance on process models. Good results have been reported in simulation. Here we demonstrate the challenges in implementing a state-of-the-art deep RL algorithm on a real physical system. Aspects include the interplay between software and existing hardware; experiment design and sample efficiency; training subject to input constraints; and interpretability of the algorithm and control law. At the core of our approach is the use of a PID controller as the trainable RL policy (a minimal sketch of this idea appears at the end of this record). In addition to its simplicity, this approach has several appealing features: no additional hardware needs to be added to the control system, since a PID controller can easily be implemented through a standard programmable logic controller; the control law can easily be initialized in a "safe" region of the parameter space; and the final product, a well-tuned PID controller, has a form that practitioners can reason about and deploy with confidence.

Highlights:
• Reinforcement learning (RL) is used to tune a real-world PID controller.
• The RL policy is a PID controller, for compatibility with many current systems.
• Good tuning is achieved in roughly 40 min of training time.
• Full implementation details and thorough lab results are presented.
• A multi-criterion scorecard compares RL with several known auto-tuning methods.
ISSN: 0967-0661, 1873-6939
DOI: 10.1016/j.conengprac.2021.105046
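
The summary above describes using a PID controller itself as the trainable RL policy. The following Python sketch illustrates that idea in general terms only: the three PID gains are the policy's sole learnable parameters, and the saturated PID output is the action. The class and parameter names, the PyTorch framing, and the actuator limits are illustrative assumptions, not the authors' implementation.

# Minimal sketch: a PID controller expressed as a trainable policy.
# The gains (Kp, Ki, Kd) are the only learnable parameters, so the
# deep RL machinery ends up tuning a shallow, interpretable control law.
# Names and details here are illustrative, not the paper's exact code.
import torch
import torch.nn as nn

class PIDPolicy(nn.Module):
    def __init__(self, kp=1.0, ki=0.1, kd=0.0, u_min=-1.0, u_max=1.0):
        super().__init__()
        # Initialize the gains in a known-stable ("safe") region of parameter space.
        self.kp = nn.Parameter(torch.tensor(kp))
        self.ki = nn.Parameter(torch.tensor(ki))
        self.kd = nn.Parameter(torch.tensor(kd))
        self.u_min, self.u_max = u_min, u_max

    def forward(self, error, error_integral, error_derivative):
        # Standard PID law; the RL algorithm only adjusts the gains.
        u = (self.kp * error
             + self.ki * error_integral
             + self.kd * error_derivative)
        # Respect actuator limits (training subject to input constraints).
        return torch.clamp(u, self.u_min, self.u_max)

# Usage: the tracking error and its integral/derivative form the policy
# input; the saturated PID output is the RL action applied to the plant.
policy = PIDPolicy()
e, e_int, e_der = torch.tensor(0.5), torch.tensor(1.2), torch.tensor(-0.1)
u = policy(e, e_int, e_der)

Because such a policy has only three parameters, the learned control law stays interpretable and, once training is complete, the resulting gains can be transferred to a standard programmable logic controller.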