Real-Time LSM-Trees for HTAP Workloads
Published in | 2023 IEEE 39th International Conference on Data Engineering (ICDE) pp. 1208 - 1220 |
---|---|
Main Authors | Hemant Saxena, Lukasz Golab, Stratos Idreos, Ihab F. Ilyas |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2023 |
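Subjects | Costs; Data engineering; HTAP; Laser modes; Layout; LSM Trees; Prototypes; Real-time systems; Storage; Throughput |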
Abstract | Real-time analytics systems employ hybrid data layouts in which data are stored in different formats throughout their lifecycle. Recent data are stored in a row-oriented format to serve OLTP workloads and support high insert rates, while older data are transformed to a column-oriented format for OLAP access patterns. We observe that a Log-Structured Merge (LSM) Tree is a natural fit for a lifecycle-aware storage engine due to its high write throughput and level-oriented structure, in which records propagate from one level to the next over time. To build a lifecycle-aware storage engine using an LSM-Tree, we make a crucial modification to allow different data layouts in different levels, ranging from purely row-oriented to purely column-oriented, leading to a Real-Time LSM-Tree. We give a cost model and an algorithm to design a Real-Time LSM-Tree that is suitable for a given workload, followed by an experimental evaluation of LASER - a prototype implementation of our idea built on top of the RocksDB key-value store. |
---|---|
Author | Ilyas, Ihab F.; Golab, Lukasz; Idreos, Stratos; Saxena, Hemant |
Author_xml | 1. Hemant Saxena (SAP Labs, Waterloo; h.saxena@sap.com); 2. Lukasz Golab (University of Waterloo; lgolab@uwaterloo.ca); 3. Stratos Idreos (Harvard University; stratos@seas.harvard.edu); 4. Ihab F. Ilyas (University of Waterloo; ilyas@uwaterloo.ca) |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DOI | 10.1109/ICDE55515.2023.00097 |
Discipline | Computer Science |
EISBN | 9798350322279 |
EISSN | 2375-026X |
EndPage | 1220 |
ExternalDocumentID | 10184774 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
PageCount | 13 |
PublicationDate | 2023-April |
PublicationTitle | 2023 IEEE 39th International Conference on Data Engineering (ICDE) |
PublicationTitleAbbrev | ICDE |
PublicationYear | 2023 |
Publisher | IEEE |
StartPage | 1208 |
SubjectTerms | Costs; Data engineering; HTAP; Laser modes; Layout; LSM Trees; Prototypes; Real-time systems; Storage; Throughput |
Title | Real-Time LSM-Trees for HTAP Workloads |
URI | https://ieeexplore.ieee.org/document/10184774 |