Federated Optimization with Doubly Regularized Drift Correction
Format | Journal Article |
---|---|
Language | English |
Published | 12.04.2024 |
Summary: Federated learning is a distributed optimization paradigm that allows training machine learning models across decentralized devices while keeping the data localized. The standard method, FedAvg, suffers from client drift, which can hamper performance and increase communication costs over centralized methods. Previous works proposed various strategies to mitigate drift, yet none have shown uniformly improved communication-computation trade-offs over vanilla gradient descent.

In this work, we revisit DANE, an established method in distributed optimization. We show that (i) DANE can achieve the desired communication reduction under Hessian similarity constraints. Furthermore, (ii) we present an extension, DANE+, which supports arbitrary inexact local solvers and has more freedom to choose how to aggregate the local updates. We propose (iii) a novel method, FedRed, which has improved local computational complexity and retains the same communication complexity compared to DANE/DANE+. This is achieved by using doubly regularized drift correction.
DOI: 10.48550/arxiv.2404.08447
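The abstract describes drift correction only at a high level, so the following is a rough, minimal sketch of a DANE-style drift-corrected local update: each client approximately solves a regularized local subproblem shifted by the gap between its local gradient and the global gradient, and the server averages the local solutions. The quadratic client losses, the hyperparameters (`mu`, `inner_steps`, `lr`), and the plain averaging step are illustrative assumptions; this is not the paper's FedRed or DANE+ pseudocode.

```python
# Illustrative sketch of DANE-style drift-corrected local updates on synthetic
# quadratic client losses f_i(x) = 0.5 x^T A_i x - b_i^T x. The losses and all
# hyperparameters are assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
d, n_clients = 5, 4
A = [np.eye(d) * d + 0.1 * rng.standard_normal((d, d)) for _ in range(n_clients)]
A = [0.5 * (M + M.T) for M in A]                 # symmetrize; diagonally dominant, hence positive definite
b = [rng.standard_normal(d) for _ in range(n_clients)]

def grad(i, x):
    """Gradient of client i's quadratic loss."""
    return A[i] @ x - b[i]

def local_step(i, x_t, global_grad, mu=1.0, inner_steps=25, lr=0.05):
    """Approximately solve the drift-corrected local subproblem
       min_x f_i(x) - <grad f_i(x_t) - global_grad, x> + (mu/2) ||x - x_t||^2
    with a few gradient steps, i.e. an inexact local solver."""
    correction = grad(i, x_t) - global_grad      # local-vs-global gradient gap
    x = x_t.copy()
    for _ in range(inner_steps):
        x -= lr * (grad(i, x) - correction + mu * (x - x_t))
    return x

x = np.zeros(d)
for _ in range(50):                              # communication rounds
    g_global = np.mean([grad(i, x) for i in range(n_clients)], axis=0)
    x = np.mean([local_step(i, x, g_global) for i in range(n_clients)], axis=0)

print("global gradient norm:",
      np.linalg.norm(np.mean([grad(i, x) for i in range(n_clients)], axis=0)))
```

The drift-correction term pulls each client's local iterates toward the direction of the global gradient, which is the general mechanism the abstract refers to; the paper's contribution concerns how to regularize and aggregate these subproblems to get better communication-computation trade-offs.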