Finite-Time Analysis of Over-the-Air Federated TD Learning
Published in: IEEE Transactions on Wireless Communications, p. 1
Main Authors:
Format: Journal Article
Language: English
Published: IEEE, 03.04.2025
Summary: In recent years, federated learning has been widely studied as a way to speed up supervised learning tasks at the wireless network edge. However, there is a lack of theoretical understanding as to whether similar speedups in sample complexity can be achieved for cooperative reinforcement learning (RL) problems subject to communication constraints. To that end, we study a federated policy evaluation problem over wireless fading channels where, to update the model parameters, a central server aggregates local temporal difference (TD) update directions from N agents via analog over-the-air computation (OAC). We refer to this scheme as OAC-FedTD and provide a rigorous finite-time convergence analysis of its performance. Our analysis reveals the impact of the noisy fading channels on the convergence rate and establishes a linear convergence speedup with respect to the number of agents. Notably, this is the first non-asymptotic analysis of a cooperative RL setting over wireless channels that jointly considers linear value function approximation, Markovian sampling, and the distortions and noise induced by the OAC channel. Our work develops the theoretical foundations needed for further advances in the analysis and design of federated reinforcement learning algorithms over wireless networks.
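The summary describes the OAC-FedTD aggregation step only at a high level. Below is a minimal, hypothetical Python sketch of federated TD(0) with linear value function approximation and analog over-the-air aggregation, intended only to illustrate the general idea. Every name and dimension, the Rayleigh fading model, the additive Gaussian channel noise, the placeholder transition kernel, and the 1/N normalization are illustrative assumptions, not the paper's exact algorithm or channel model.

```python
import numpy as np

# Hypothetical sketch of federated TD(0) with analog over-the-air (OAC)
# aggregation. All quantities below are illustrative assumptions, not the
# paper's exact OAC-FedTD specification.

rng = np.random.default_rng(0)

N = 10           # number of agents
d = 8            # feature dimension (linear value function approximation)
S = 20           # number of states (assumed common to all agents)
gamma = 0.95     # discount factor
alpha = 0.05     # step size
T = 500          # communication rounds
noise_std = 0.1  # additive channel noise level (assumed Gaussian)

# Shared feature map for the linear approximation V(s) ~ Phi[s] @ theta.
Phi = rng.standard_normal((S, d)) / np.sqrt(d)

def local_td_direction(theta, state):
    """One TD(0) semi-gradient step at an agent.

    The transition and reward here are placeholders; the paper assumes
    Markovian sampling along each agent's own trajectory.
    """
    next_state = rng.integers(S)       # placeholder transition kernel
    reward = rng.standard_normal()     # placeholder reward
    delta = reward + gamma * Phi[next_state] @ theta - Phi[state] @ theta
    return delta * Phi[state], next_state

theta = np.zeros(d)
states = rng.integers(S, size=N)       # each agent's current state

for t in range(T):
    # Each agent computes a local TD update direction from its trajectory.
    directions = np.empty((N, d))
    for i in range(N):
        directions[i], states[i] = local_td_direction(theta, states[i])

    # Analog OAC: the agents' transmitted signals superpose on the channel,
    # so the server receives sum_i h_i * g_i plus noise rather than the
    # exact sum of directions (fading gains h_i assumed Rayleigh here).
    h = rng.rayleigh(scale=1.0, size=N)
    received = (h[:, None] * directions).sum(axis=0)
    received += noise_std * rng.standard_normal(d)

    # Server normalizes by N and takes a TD step with the noisy aggregate.
    theta += alpha * received / N
```

In this toy model the channel distortion enters through the random gains h and the additive noise, which is the kind of perturbation the paper's finite-time analysis has to control; the averaging over N agents is what yields the linear speedup discussed in the summary.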
ISSN: 1536-1276, 1558-2248
DOI: 10.1109/TWC.2025.3555941