Staleness-aware semi-asynchronous federated learning

Bibliographic Details
Published in: Journal of Parallel and Distributed Computing, Vol. 193, p. 104950
Main Authors: Yu, Miri; Choi, Jiheon; Lee, Jaehyun; Oh, Sangyoon
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.11.2024
Summary: As attempts to distribute deep learning over personal data have increased, so has the importance of federated learning (FL). Prior work addresses the core challenges of FL (i.e., statistical and system heterogeneity) using synchronous or asynchronous protocols. However, stragglers reduce training efficiency under each protocol, degrading latency and accuracy, respectively. A semi-asynchronous protocol that combines the two can be applied to FL to mitigate straggler issues; however, effectively handling the staleness of local models remains difficult. We propose SASAFL to resolve the training inefficiency caused by staleness in semi-asynchronous FL. SASAFL enables stable training by considering the quality of the global model when synchronising the server and clients. In addition, it achieves high accuracy and low latency by adjusting the number of participating clients in response to changes in the global loss and by immediately processing clients that did not participate in the previous round. An evaluation was conducted under various conditions to verify the effectiveness of SASAFL. SASAFL achieved 19.69%p higher accuracy than the baseline, 2.32 times better round-to-accuracy, and 2.24 times better latency-to-accuracy. Additionally, SASAFL always reached target accuracies that the baseline could not.
ISSN: 0743-7315
DOI: 10.1016/j.jpdc.2024.104950
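The summary above describes a semi-asynchronous server that buffers client updates and discounts stale ones during aggregation. The following is a minimal, illustrative sketch of that general idea only; it is not the SASAFL algorithm itself, and all names (staleness_discount, aggregate_round, buffer_size) and the discount formula are assumptions made for demonstration.

# Hedged sketch: staleness-discounted aggregation in a semi-asynchronous FL round.
# Not the paper's method; formula and names are illustrative assumptions.

import numpy as np


def staleness_discount(staleness: int, alpha: float = 0.5) -> float:
    # Older updates (larger staleness) receive smaller aggregation weights.
    return 1.0 / (1.0 + staleness) ** alpha


def aggregate_round(updates, current_round):
    # updates: list of (client_params, num_samples, base_round), where
    # base_round is the global round the client's training started from.
    weights, models = [], []
    for client_params, num_samples, base_round in updates:
        staleness = current_round - base_round
        weights.append(num_samples * staleness_discount(staleness))
        models.append(np.asarray(client_params, dtype=float))
    weights = np.asarray(weights)
    weights /= weights.sum()
    # Weighted average of the buffered client models forms the new global model.
    return sum(w * m for w, m in zip(weights, models))


if __name__ == "__main__":
    # Semi-asynchronous flavour: the server aggregates as soon as a small
    # buffer of updates arrives instead of waiting for every client.
    buffer_size = 2
    buffered = [
        (np.array([1.0, 2.0]), 100, 5),  # fresh update, started at round 5
        (np.array([0.5, 1.5]), 200, 3),  # stale update, started at round 3
    ][:buffer_size]
    print(aggregate_round(buffered, current_round=5))

In this sketch, the fresh update keeps its full sample-count weight while the stale one is down-weighted, which is one common way semi-asynchronous schemes limit the damage stragglers can do to the global model.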