Dalio: In-Kernel Centralized Replication for Key-Value Stores
Published in | IEICE Transactions on Information and Systems Vol. E108.D; no. 2; pp. 157 - 160 |
---|---|
Main Author | KIM, Gyuyeong |
Format | Journal Article |
Language | English |
Published | Tokyo: The Institute of Electronics, Information and Communication Engineers, 01.02.2025; Japan Science and Technology Agency |
Subjects | Hardware; in-kernel acceleration; networking stacks; Replication; replication protocol |
ISSN | 0916-8532 1745-1361 |
DOI | 10.1587/transinf.2024EDL8060 |
Abstract | Replication is commonly used in distributed key-value stores for high availability. Recent work shows that centralized replication provides high throughput through low-overhead write coordination and consistency-aware read forwarding. Unfortunately, existing systems rely on specialized hardware, which is challenging to deploy and imposes various limitations. To this end, we present Dalio, a software-based centralized replication system that supports high throughput without requiring extra hardware. Our key idea is to offload the replication function to per-shard load balancers with eBPF, an emerging kernel-native technique. By building the replication coordinator with eBPF, we avoid the overhead of the kernel networking stack. Our experimental results show that Dalio achieves up to 2.05× higher throughput than vanilla Linux and is comparable to a hardware-based solution. |
---|---|
ArticleNumber | 2024EDL8060 |
Author | KIM, Gyuyeong |
Author_xml | – sequence: 1 fullname: KIM, Gyuyeong organization: Department of Computer Engineering, Sungshin Women’s University |
Cites_doi | 10.1145/1807128.1807152 10.1145/3132747.3132764 10.1145/3373376.3378496 10.14778/3523210.3523213 |
ContentType | Journal Article |
Copyright | 2025 The Institute of Electronics, Information and Communication Engineers Copyright Japan Science and Technology Agency 2025 |
DOI | 10.1587/transinf.2024EDL8060 |
Discipline | Engineering Computer Science |
EISSN | 1745-1361 |
EndPage | 160 |
ISSN | 0916-8532 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Language | English |
OpenAccessLink | https://www.jstage.jst.go.jp/article/transinf/E108.D/2/E108.D_2024EDL8060/_article/-char/en |
PageCount | 4 |
PublicationDate | 2025-02-01 |
PublicationPlace | Tokyo |
PublicationTitle | IEICE Transactions on Information and Systems |
PublicationTitleAlternate | IEICE Trans. Inf. & Syst. |
PublicationYear | 2025 |
Publisher | The Institute of Electronics, Information and Communication Engineers; Japan Science and Technology Agency |
StartPage | 157 |
SubjectTerms | Hardware; in-kernel acceleration; networking stacks; Replication; replication protocol |
Title | Dalio: In-Kernel Centralized Replication for Key-Value Stores |
URI | https://www.jstage.jst.go.jp/article/transinf/E108.D/2/E108.D_2024EDL8060/_article/-char/en https://www.proquest.com/docview/3177328579 |
Volume | E108.D |
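
The abstract describes centralized replication as a combination of low-overhead write coordination and consistency-aware read forwarding, performed at a per-shard load balancer. The following is a minimal user-space sketch of that general idea, written in Python rather than eBPF. It is not Dalio's actual protocol or code: the class name `ShardCoordinator`, the per-key in-flight write counter, the use of replica 0 as the fallback target for keys with in-flight writes, and the round-robin policy for other reads are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import count


class ShardCoordinator:
    """Hypothetical per-shard coordinator (illustrative only, not Dalio's code)."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("need at least one replica")
        self.replicas = list(replicas)    # replica identifiers for this shard
        self.pending = defaultdict(int)   # key -> number of in-flight writes
        self._seq = count(1)              # single ordering point for writes
        self._rr = 0                      # round-robin cursor for clean reads

    def route_write(self, key):
        """Assign a sequence number and fan the write out to every replica."""
        seq = next(self._seq)
        self.pending[key] += 1
        return seq, list(self.replicas)

    def complete_write(self, key):
        """Call once every replica has acknowledged one write to `key`."""
        if self.pending.get(key, 0) == 0:
            raise KeyError(key)
        self.pending[key] -= 1
        if self.pending[key] == 0:
            del self.pending[key]

    def route_read(self, key):
        """Forward a read based on whether the key has in-flight writes."""
        if self.pending.get(key, 0) > 0:
            # Assumption: replica 0 applies writes in sequence order first,
            # so it is the safe target while replicas may still diverge.
            return self.replicas[0]
        # No in-flight writes: all replicas agree, so spread the load.
        target = self.replicas[self._rr % len(self.replicas)]
        self._rr += 1
        return target


# Usage: one write to "k", a read while it is in flight, then clean reads.
coord = ShardCoordinator(["r0", "r1", "r2"])
seq, targets = coord.route_write("k")
dirty_target = coord.route_read("k")
coord.complete_write("k")
clean_targets = [coord.route_read("k") for _ in range(3)]
```

In this sketch the coordinator is the only component that needs to know which keys are mid-update, which is what lets reads be served by any replica without a per-read agreement round. An in-kernel version would have to keep the equivalent state in kernel-resident data structures; how Dalio does so is not stated in this record.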