Dalio: In-Kernel Centralized Replication for Key-Value Stores

Replication is commonly used in distributed key-value stores for high availability. Recent works show that centralized replication provides high throughput through low-overhead write coordination and consistency-aware read forwarding. Unfortunately, they rely on specialized hardware, which is deploy...

Bibliographic Details
Published in IEICE Transactions on Information and Systems Vol. E108.D; no. 2; pp. 157 - 160
Main Author KIM, Gyuyeong
Format Journal Article
Language English
Published Tokyo The Institute of Electronics, Information and Communication Engineers 01.02.2025
Japan Science and Technology Agency
ISSN 0916-8532
1745-1361
DOI 10.1587/transinf.2024EDL8060

Abstract Replication is commonly used in distributed key-value stores for high availability. Recent work shows that centralized replication provides high throughput through low-overhead write coordination and consistency-aware read forwarding. Unfortunately, existing designs rely on specialized hardware, which is challenging to deploy and imposes various limitations. To this end, we present Dalio, a software-based centralized replication system that supports high throughput without requiring extra hardware. Our key idea is to offload the replication function to per-shard load balancers built with eBPF, an emerging kernel-native technology. By building the replication coordinator with eBPF, we avoid the overhead of the kernel networking stack. Our experimental results show that Dalio achieves up to 2.05× higher throughput than vanilla Linux and is comparable to a hardware-based solution.
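The abstract names two mechanisms, write coordination and consistency-aware read forwarding, without giving the protocol. The following is a minimal user-space sketch of how a per-shard coordinator of this general kind could behave. It is not Dalio's eBPF implementation: the class names, the "pending" table, the choice of replica 0 as the safe target for in-flight keys, and the single-outstanding-write-per-key simplification are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage server holding a full copy of the shard (hypothetical)."""
    store: dict = field(default_factory=dict)

    def apply(self, key, value):
        self.store[key] = value


@dataclass
class Coordinator:
    """Hypothetical per-shard coordinator; NOT the paper's eBPF code."""
    replicas: list
    pending: dict = field(default_factory=dict)  # key -> acks still missing
    next_reader: int = 0

    def write(self, key, value):
        # Mark the key as in flight, then fan the write out to every replica.
        # Assumption: at most one outstanding write per key.
        self.pending[key] = len(self.replicas)
        return [(replica, key, value) for replica in self.replicas]

    def ack(self, key):
        # A replica confirmed the write; the key is clean once all have.
        self.pending[key] -= 1
        if self.pending[key] == 0:
            del self.pending[key]

    def route_read(self, key):
        # In-flight key: forward to replica 0, assumed to apply writes first.
        if key in self.pending:
            return 0
        # Clean key: every replica is up to date, so spread reads round-robin.
        target = self.next_reader
        self.next_reader = (self.next_reader + 1) % len(self.replicas)
        return target
```

In this sketch the coordinator only tracks which keys have unacknowledged writes, so reads to clean keys can be served by any replica while reads to in-flight keys are pinned to one replica. Whether Dalio keeps equivalent state in eBPF maps is not stated in the abstract.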
ArticleNumber 2024EDL8060
Author KIM, Gyuyeong
Author_xml – sequence: 1
  fullname: KIM, Gyuyeong
  organization: Department of Computer Engineering, Sungshin Women’s University
Cites_doi 10.1145/1807128.1807152
10.1145/3132747.3132764
10.1145/3373376.3378496
10.14778/3523210.3523213
ContentType Journal Article
Copyright 2025 The Institute of Electronics, Information and Communication Engineers
Copyright Japan Science and Technology Agency 2025
Copyright_xml – notice: 2025 The Institute of Electronics, Information and Communication Engineers
– notice: Copyright Japan Science and Technology Agency 2025
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional

DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1745-1361
EndPage 160
ExternalDocumentID 10_1587_transinf_2024EDL8060
article_transinf_E108_D_2_E108_D_2024EDL8060_article_char_en
ISSN 0916-8532
IngestDate Mon Jun 30 12:10:00 EDT 2025
Tue Jul 01 02:54:11 EDT 2025
Thu Mar 06 13:50:58 EST 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
LinkModel OpenURL
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
OpenAccessLink https://www.jstage.jst.go.jp/article/transinf/E108.D/2/E108.D_2024EDL8060/_article/-char/en
PQID 3177328579
PQPubID 2048497
PageCount 4
ParticipantIDs proquest_journals_3177328579
crossref_primary_10_1587_transinf_2024EDL8060
jstage_primary_article_transinf_E108_D_2_E108_D_2024EDL8060_article_char_en
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2025-02-01
PublicationDateYYYYMMDD 2025-02-01
PublicationDate_xml – month: 02
  year: 2025
  text: 2025-02-01
  day: 01
PublicationDecade 2020
PublicationPlace Tokyo
PublicationPlace_xml – name: Tokyo
PublicationTitle IEICE Transactions on Information and Systems
PublicationTitleAlternate IEICE Trans. Inf. & Syst.
PublicationYear 2025
Publisher The Institute of Electronics, Information and Communication Engineers
Japan Science and Technology Agency
Publisher_xml – name: The Institute of Electronics, Information and Communication Engineers
– name: Japan Science and Technology Agency
References [1] P. Hunt, M. Konar, F.P. Junqueira, and B. Reed, “ZooKeeper: Wait-free coordination for internet-scale systems,” Proc. of USENIX ATC, USA, p.11, 2010.
[2] A. Katsarakis, V. Gavrielatos, M.R.S. Katebzadeh, A. Joshi, A. Dragojevic, B. Grot, and V. Nagarajan, “Hermes: A fast, fault-tolerant and linearizable replication protocol,” Proc. of ACM ASPLOS, New York, NY, USA, pp.201-217, 2020. doi:10.1145/3373376.3378496
[3] P.A. Alsberg and J.D. Day, “A principle for resilient sharing of distributed resources,” Proc. of ICSE, Washington, DC, USA, pp.562-570, 1976.
[4] R.V. Renesse and F.B. Schneider, “Chain replication for supporting high throughput and availability,” Proc. of USENIX OSDI, San Francisco, CA, pp.91-104, Dec. 2004.
[5] J. Terrace and M.J. Freedman, “Object storage on CRAQ: High-throughput chain replication for read-mostly workloads,” Proc. of USENIX ATC, p.11, 2009.
[6] G. Kim and W. Lee, “In-network leaderless replication for distributed data stores,” Proc. VLDB Endow., vol.15, no.7, pp.1337-1349, Mar. 2022. doi:10.14778/3523210.3523213
[7] “Tofino switch,” https://github.com/barefootnetworks/Open-Tofino, last accessed date: 26 June 2024, 2023.
[8] Z. Cao, S. Dong, S. Vemuri, and D.H. Du, “Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook,” Proc. of USENIX FAST, Santa Clara, CA, Feb. 2020.
[9] “eBPF,” https://ebpf.io/, last accessed date: 26 June 2024, 2024.
[10] M. Primorac, K. Argyraki, and E. Bugnion, “When to hedge in interactive services,” Proc. of USENIX NSDI, pp.373-387, April 2021.
[11] Y. Zhou, X. Xiang, M. Kiley, S. Dharanipragada, and M. Yu, “DINT: Fast in-kernel distributed transactions with eBPF,” Proc. of USENIX NSDI, Santa Clara, CA, pp.401-417, April 2024.
[12] B.F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with YCSB,” Proc. of ACM SoCC, New York, NY, USA, pp.143-154, Association for Computing Machinery, 2010. doi:10.1145/1807128.1807152
[13] “TommyDS C library,” https://www.tommyds.it/, last accessed date: 26 June 2024, 2018.
[14] X. Jin, X. Li, H. Zhang, R. Soulé, J. Lee, N. Foster, C. Kim, and I. Stoica, “NetCache: Balancing key-value stores with fast in-network caching,” Proc. of ACM SOSP, pp.121-136, 2017. doi:10.1145/3132747.3132764
SSID ssj0018215
Score 2.3702233
SourceID proquest
crossref
jstage
SourceType Aggregation Database
Index Database
Publisher
StartPage 157
SubjectTerms Hardware
in-kernel acceleration
networking stacks
Replication
replication protocol
Title Dalio: In-Kernel Centralized Replication for Key-Value Stores
URI https://www.jstage.jst.go.jp/article/transinf/E108.D/2/E108.D_2024EDL8060/_article/-char/en
https://www.proquest.com/docview/3177328579
Volume E108.D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
ispartofPNX IEICE Transactions on Information and Systems, 2025/02/01, Vol.E108.D(2), pp.157-160