Model-Based Reinforcement Learning for Control of Strongly-Disturbed Unsteady Aerodynamic Flows
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 26.08.2024 |
Subjects | |
Summary: | The intrinsic high dimension of fluid dynamics is an inherent challenge to
control of aerodynamic flows, and this is further complicated by a flow's
nonlinear response to strong disturbances. Deep reinforcement learning, which
takes advantage of the exploratory aspects of reinforcement learning (RL) and
the rich nonlinearity of a deep neural network, provides a promising approach
to discover feasible control strategies. However, the typical model-free
approach to reinforcement learning requires a significant amount of interaction
between the flow environment and the RL agent during training, and this high
training cost impedes its development and application. In this work, we propose
a model-based reinforcement learning (MBRL) approach by incorporating a novel
reduced-order model as a surrogate for the full environment. The model consists
of a physics-augmented autoencoder, which compresses high-dimensional CFD flow
field snapshots into a three-dimensional latent space, and a latent dynamics
model that is trained to accurately predict the long-time dynamics of
trajectories in the latent space in response to action sequences. The
robustness and generalizability of the model are demonstrated in two distinct
flow environments: a pitching airfoil in a highly disturbed environment and a
vertical-axis wind turbine in a disturbance-free environment. Based on the
trained model in the first problem, we realize an MBRL strategy to mitigate
lift variation during gust-airfoil encounters. We demonstrate that the policy
learned in the reduced-order environment translates to an effective control
strategy in the full CFD environment. |
---|---|
DOI: | 10.48550/arxiv.2408.14685 |