Privacy Preserving Reinforcement Learning for Population Processes
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 25.06.2024 |
Summary: | We consider the problem of privacy protection in Reinforcement Learning (RL)
algorithms that operate over population processes, a practical but understudied
setting that includes, for example, the control of epidemics in large
populations of dynamically interacting individuals. In this setting, the RL
algorithm interacts with the population over $T$ time steps by receiving
population-level statistics as state and performing actions which can affect
the entire population at each time step. An individual's data can be collected
across multiple interactions and their privacy must be protected at all times.
We clarify the Bayesian semantics of Differential Privacy (DP) in the presence
of correlated data in population processes through a Pufferfish Privacy
analysis. We then give a meta-algorithm that takes any RL algorithm as input
and makes it differentially private: DP mechanisms privatize the state and
reward signal at each time step before the RL algorithm receives them as
input. Our main theoretical result
shows that the value-function approximation error when applying standard RL
algorithms directly to the privatized states shrinks quickly as the population
size and privacy budget increase. This highlights that reasonable
privacy-utility trade-offs are possible for differentially private RL
algorithms in population processes. Our theoretical findings are validated by
experiments performed on a simulated epidemic control problem over large
population sizes. |
---|---|
DOI: | 10.48550/arxiv.2406.17649 |
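
The summary describes privatizing the state and reward signal at each time step before the RL algorithm receives them. The following is a minimal sketch of that idea, not the paper's algorithm. It assumes the Laplace mechanism, an even split of the privacy budget over all releases (basic sequential composition), and caller-supplied sensitivities; the summary specifies none of these. `ToyPopulationEnv`, `PrivatizedEnv`, and `laplace_privatize` are hypothetical names introduced here for illustration.

```python
import numpy as np


def laplace_privatize(values, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity/epsilon to each entry."""
    values = np.asarray(values, dtype=float)
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=values.shape)


class ToyPopulationEnv:
    """Hypothetical stand-in for a population process.

    The state is the number of individuals in each of three health
    states (susceptible, infected, recovered).
    """

    def __init__(self, population=10_000, seed=0):
        self.population = population
        self.rng = np.random.default_rng(seed)
        self.counts = None

    def reset(self):
        self.counts = np.array([self.population - 10, 10, 0])
        return self.counts.copy()

    def step(self, action):
        s, i, r = self.counts
        rate = 0.1 if action == 1 else 0.3  # action 1 = intervene
        p_infect = min(1.0, rate * i / self.population)
        new_infected = self.rng.binomial(s, p_infect)
        new_recovered = self.rng.binomial(i, 0.1)
        self.counts = np.array(
            [s - new_infected, i + new_infected - new_recovered, r + new_recovered]
        )
        reward = -float(self.counts[1]) / self.population
        return self.counts.copy(), reward


class PrivatizedEnv:
    """Wraps an environment so the learner only sees noisy states and rewards."""

    def __init__(self, env, total_epsilon, horizon,
                 state_sensitivity, reward_sensitivity, seed=0):
        self.env = env
        self.state_sensitivity = state_sensitivity
        self.reward_sensitivity = reward_sensitivity
        self.rng = np.random.default_rng(seed)
        # Assumption: one state release at reset, then one state and one
        # reward release per step, with the budget split evenly among them.
        self.eps_per_release = total_epsilon / (2 * horizon + 1)

    def reset(self):
        state = self.env.reset()
        return laplace_privatize(
            state, self.state_sensitivity, self.eps_per_release, self.rng
        )

    def step(self, action):
        state, reward = self.env.step(action)
        noisy_state = laplace_privatize(
            state, self.state_sensitivity, self.eps_per_release, self.rng
        )
        noisy_reward = float(laplace_privatize(
            reward, self.reward_sensitivity, self.eps_per_release, self.rng
        ))
        return noisy_state, noisy_reward


# Usage: any RL algorithm would interact with `private_env` instead of the raw one.
horizon = 10
raw_env = ToyPopulationEnv(population=10_000)
private_env = PrivatizedEnv(
    raw_env, total_epsilon=1.0, horizon=horizon,
    state_sensitivity=2.0,        # assumption: one person moves between two counts
    reward_sensitivity=1.0 / 10_000,  # assumption: reward is -infected/population
)
state = private_env.reset()
for _ in range(horizon):
    state, reward = private_env.step(action=1)
```

In this sketch the noise scale is fixed by the sensitivity and the per-release budget, while the counts grow with the population, so the relative distortion of the state falls as the population or the budget grows. That is only an informal illustration of the trade-off the summary refers to, not a restatement of the paper's bound.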