Staleness aware semi-asynchronous federated learning

Bibliographic Details
Published in	Journal of parallel and distributed computing Vol. 193; p. 104950
Main Authors	Yu, Miri; Choi, Jiheon; Lee, Jaehyun; Oh, Sangyoon
Format	Journal Article
Language	English
Published	Elsevier Inc 01.11.2024
Subjects	Federated learning; Semi-asynchronous; Staleness

Summary:	As attempts to distribute deep learning over personal data have increased, so has the importance of federated learning (FL). Synchronous and asynchronous protocols have both been used to address the core challenges of FL, namely statistical and system heterogeneity. However, stragglers reduce training efficiency under either protocol: they increase latency in the synchronous case and lower accuracy in the asynchronous case. A semi-asynchronous protocol that combines the two can be applied to FL to address the straggler issue, but effectively handling the staleness of local models remains difficult. We propose SASAFL to address the training inefficiency caused by staleness in semi-asynchronous FL. SASAFL enables stable training by taking the quality of the global model into account when synchronising servers and clients. In addition, it achieves high accuracy and low latency by adjusting the number of participating clients in response to changes in the global loss and by immediately processing clients that did not participate in the previous round. SASAFL was evaluated under various conditions to verify its effectiveness. It achieved 19.69 percentage points higher accuracy than the baseline, 2.32 times higher round-to-accuracy, and 2.24 times higher latency-to-accuracy. Additionally, SASAFL always achieved target accuracies that the baseline could not reach.
ISSN:	0743-7315
DOI:	10.1016/j.jpdc.2024.104950