Haste makes waste: The On–Off algorithm for replica selection in key–value stores

Bibliographic Details
Published in Journal of parallel and distributed computing Vol. 130; pp. 80-90
Main Authors Jiang, Wanchun; Xie, Haiming; Zhou, Xiangqian; Fang, Liyuan; Wang, Jianxin
Format Journal Article
Language English
Published Elsevier Inc 01.08.2019
Summary: In current large-scale distributed key–value stores, the tail latency of the key–value accesses generated by end-user requests is crucial to the response time of these requests. To cut the tail latency, the replica selection algorithm is crucial: a client uses it to select a replica server for each key, so it determines the latency of each key–value access. All current replica selection algorithms send keys out immediately in order to reduce the tail latency of key–value accesses. In this paper, we find that sending keys out in haste wastes the chance to select a better replica server a short time later, and we suggest waiting for a better replica server to become available when all current replica servers are bad. To realize this idea, we develop the On–Off algorithm, which recognizes bad replica servers from feedback information and puts them into the OFF state. Special attention is paid to how long replica servers stay in the OFF state. The On–Off algorithm adds waiting time at clients but can greatly reduce the dominant queuing delays at replica servers. Overall, the On–Off algorithm improves the 99th-percentile latency by about 29% under the default simulation configuration and outperforms the recently proposed C3 algorithm in a variety of scenarios.

• In replica selection for key–value stores, keys can wait for a better replica server that becomes available a short time later instead of being sent out immediately, especially when all current replica servers are bad.
• To reduce the tail latency, we design the On–Off algorithm, which trades controllable waiting time at clients for small maximum queuing delays at replica servers.
• The On–Off algorithm outperforms C3 in a variety of scenarios and improves the 99th-percentile tail latency of key–value accesses by about 29% under the default simulation configuration.
ISSN:0743-7315, 1096-0848
DOI:10.1016/j.jpdc.2019.03.017
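
The idea in the summary can be illustrated with a minimal client-side sketch. It is not the authors' implementation: the summary does not say which feedback is used or how the OFF interval is chosen, so the queue-length feedback, the `queue_threshold` cutoff, the fixed `off_interval`, and all names below are assumptions made only for illustration. The sketch shows the core behavior: a replica reporting a long queue is switched OFF for a while, and when every replica is OFF the selector returns nothing, signaling the client to hold the key rather than send it in haste.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OnOffSelector:
    """Toy On/Off-style selector; thresholds and names are hypothetical."""

    queue_threshold: int = 8     # assumed cutoff above which a server is "bad"
    off_interval: float = 0.005  # assumed fixed OFF duration (seconds)
    queue_len: Dict[str, int] = field(default_factory=dict)
    off_until: Dict[str, float] = field(default_factory=dict)

    def on_feedback(self, server: str, queue_len: int, now: float) -> None:
        """Store the reported queue length; switch a bad server to OFF."""
        self.queue_len[server] = queue_len
        if queue_len > self.queue_threshold:
            self.off_until[server] = now + self.off_interval

    def select(self, replicas: List[str], now: float) -> Optional[str]:
        """Return the ON replica with the shortest known queue.

        Returns None when every replica is OFF, so the caller can hold the key.
        """
        on = [s for s in replicas if self.off_until.get(s, 0.0) <= now]
        if not on:
            return None
        return min(on, key=lambda s: self.queue_len.get(s, 0))

    def wait_time(self, replicas: List[str], now: float) -> float:
        """Time until the first replica in the list is ON again."""
        earliest = min(self.off_until.get(s, 0.0) for s in replicas)
        return max(0.0, earliest - now)
```

A client would call `select` for each key and, on `None`, wait for `wait_time` before retrying. A fixed OFF interval is the simplest possible choice here; the summary notes that the authors pay special attention to this interval, so the real algorithm presumably sets it more carefully.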