Federated Optimization with Doubly Regularized Drift Correction

Bibliographic Details
Main Authors	Jiang, Xiaowen; Rodomanov, Anton; Stich, Sebastian U
Format	Journal Article
Language	English
Published	12.04.2024
Subjects	Computer Science - Learning; Mathematics - Optimization and Control

Summary:	Federated learning is a distributed optimization paradigm that allows training machine learning models across decentralized devices while keeping the data localized. The standard method, FedAvg, suffers from client drift, which can hamper performance and increase communication costs over centralized methods. Previous works proposed various strategies to mitigate drift, yet none have shown uniformly improved communication-computation trade-offs over vanilla gradient descent. In this work, we revisit DANE, an established method in distributed optimization. We show that (i) DANE can achieve the desired communication reduction under Hessian similarity constraints. Furthermore, (ii) we present an extension, DANE+, which supports arbitrary inexact local solvers and has more freedom to choose how to aggregate the local updates. We propose (iii) a novel method, FedRed, which has improved local computational complexity and retains the same communication complexity compared to DANE/DANE+. This is achieved by using doubly regularized drift correction.
DOI:	10.48550/arxiv.2404.08447