Decentralised federated learning with adaptive partial gradient aggregation

Bibliographic Details
Published in: CAAI Transactions on Intelligence Technology, Vol. 5, No. 3, pp. 230–236
Main Authors: Jiang, Jingyan; Hu, Liang
Format: Journal Article
Language: English
Published: Beijing: The Institution of Engineering and Technology, 01.09.2020
John Wiley & Sons, Inc
Wiley
More Information
Summary: Federated learning aims to collaboratively train a machine learning model across possibly geo-distributed workers and is therefore inherently communication constrained. To improve communication efficiency, conventional federated learning algorithms let each worker reduce its communication frequency by performing multiple local training steps. The conventional federated learning architecture, inherited from the parameter-server design, relies on a highly centralised topology and large node-to-server bandwidth, and its convergence depends on local stochastic gradient descent training, which usually leads to large end-to-end training latency in real-world federated learning scenarios. In this study, the authors therefore propose the adaptive partial gradient aggregation method (FedPGA), a decentralised federated learning scheme operating at the partial-gradient level, to tackle this problem. In FedPGA, a partial gradient exchange mechanism makes full use of node-to-node bandwidth to speed up communication. In addition, an adaptive model-updating method further accelerates convergence by adaptively increasing the step size along stable gradient-descent directions. Experimental results on various datasets demonstrate that training time is reduced by up to $14\times$ compared with baselines, without accuracy degradation.
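The abstract describes FedPGA only at a high level. The toy sketch below illustrates the two ideas it names, partial gradient exchange over node-to-node links and a step size that grows along stable descent directions, on a synthetic quadratic problem. Everything here (the ring topology, exchange_fraction, gain and cap parameters, and per-coordinate step sizes) is an assumption made for illustration and is not the authors' FedPGA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: worker i holds the local loss 0.5*||x - c_i||^2,
# whose gradient at x is simply (x - c_i).
num_workers, dim = 4, 8
centers = rng.normal(size=(num_workers, dim))
models = [np.zeros(dim) for _ in range(num_workers)]
prev_grads = [np.zeros(dim) for _ in range(num_workers)]
step = np.full((num_workers, dim), 0.1)  # per-coordinate step sizes (assumed)
ring = [((i - 1) % num_workers, (i + 1) % num_workers) for i in range(num_workers)]

exchange_fraction = 0.5  # hypothetical: exchange only half of the gradient coordinates
gain, cap = 1.05, 1.0    # hypothetical adaptive step-size parameters

for rnd in range(200):
    grads = [models[i] - centers[i] for i in range(num_workers)]

    # Partial gradient exchange: each worker averages only a subset of
    # coordinates with its ring neighbours; the remaining coordinates stay local.
    k = int(exchange_fraction * dim)
    shared = rng.choice(dim, size=k, replace=False)
    new_grads = []
    for i, (left, right) in enumerate(ring):
        g = grads[i].copy()
        g[shared] = (grads[i][shared] + grads[left][shared] + grads[right][shared]) / 3.0
        new_grads.append(g)

    # Adaptive update: enlarge the step on coordinates whose gradient sign is
    # stable across rounds, shrink it where the sign flips.
    for i in range(num_workers):
        stable = np.sign(new_grads[i]) == np.sign(prev_grads[i])
        step[i] = np.where(stable, np.minimum(step[i] * gain, cap), step[i] / gain)
        models[i] -= step[i] * new_grads[i]
        prev_grads[i] = new_grads[i]

print("final disagreement:",
      max(np.linalg.norm(models[i] - models[0]) for i in range(num_workers)))
```

Running the sketch shows the workers' models drifting toward agreement even though only a fraction of each gradient is exchanged per round; the sign-stability rule is one simple stand-in for the "stable direction" criterion the abstract mentions.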
ISSN: 2468-2322
DOI: 10.1049/trit.2020.0082