Finite-Time Analysis of Asynchronous Multi-Agent TD Learning
Format | Journal Article |
---|---|
Language | English |
Published | 29.07.2024 |
Summary: Recent research endeavours have theoretically shown the beneficial effect of cooperation in multi-agent reinforcement learning (MARL). In a setting involving $N$ agents, this beneficial effect usually comes in the form of an $N$-fold linear convergence speedup, i.e., a reduction, proportional to $N$, in the number of iterations required to reach a given convergence precision. In this paper, we show for the first time that this speedup property also holds for a MARL framework subject to asynchronous delays in the local agents' updates. In particular, we consider a policy evaluation problem in which multiple agents cooperate to evaluate a common policy by communicating with a central aggregator. In this setting, we study the finite-time convergence of \texttt{AsyncMATD}, an asynchronous multi-agent temporal difference (TD) learning algorithm in which the agents' local TD update directions are subject to asynchronous bounded delays. Our main contribution is a finite-time analysis of \texttt{AsyncMATD}, establishing a linear convergence speedup while highlighting the effect of time-varying asynchronous delays on the resulting convergence rate.
DOI: 10.48550/arxiv.2407.20441
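To make the setting described in the summary concrete, below is a minimal sketch of an asynchronous multi-agent TD(0) step with a central aggregator and bounded delays. It is not the paper's code: the toy Markov reward process, the linear features, the uniform delay model, the step size, and names such as `local_td_direction` are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy setup: a small Markov reward process evaluated with
# linear function approximation V(s) = phi(s)^T theta. All names and
# parameters here are illustrative assumptions, not the paper's code.
rng = np.random.default_rng(0)
S, d, N = 5, 3, 4            # states, feature dimension, number of agents
gamma, alpha = 0.9, 0.05     # discount factor, step size
tau_max = 3                  # assumed bound on the asynchronous delays

P = rng.dirichlet(np.ones(S), size=S)   # shared transition kernel
R = rng.normal(size=S)                  # shared reward function
Phi = rng.normal(size=(S, d))           # fixed feature matrix

def local_td_direction(theta, s, s_next, r):
    """One agent's TD(0) update direction under linear approximation."""
    td_error = r + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    return td_error * Phi[s]

theta = np.zeros(d)
states = rng.integers(S, size=N)        # each agent runs its own chain
# Most recent directions reported by each agent; the aggregator may only
# see a stale (delayed) version of these.
buffers = [[np.zeros(d)] for _ in range(N)]

for t in range(2000):
    # Each agent samples a transition and pushes a fresh TD direction.
    for i in range(N):
        s = states[i]
        s_next = rng.choice(S, p=P[s])
        buffers[i].append(local_td_direction(theta, s, s_next, R[s]))
        states[i] = s_next

    # The central aggregator averages directions that are each delayed by
    # some time-varying amount, bounded above by tau_max.
    aggregated = np.zeros(d)
    for i in range(N):
        delay = rng.integers(0, min(tau_max, len(buffers[i]) - 1) + 1)
        aggregated += buffers[i][-1 - delay]
    theta += alpha * aggregated / N     # averaged delayed TD step
```

The sketch only illustrates the update structure the summary describes: averaging over $N$ agents is what yields the variance reduction behind the linear speedup, while the staleness of the buffered directions is the delay effect the finite-time analysis has to control.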