Diskless Checkpointing with Rollback-Dependency Trackability
Published in | Proceedings - Symposium on Reliable Distributed Systems, pp. 275-281 |
---|---|
Main Authors | Menderico, R M; Garcia, I C |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.10.2010 |
ISBN | 9780769542508 0769542506 |
ISSN | 1060-9857 |
DOI | 10.1109/SRDS.2010.17 |
Abstract | One way to implement fault-tolerant applications is to store the application's current state in stable memory and, when a failure occurs, restart the application from the last consistent global state. If the number of simultaneous failures is expected to be small, a diskless checkpointing approach can be used, in which a failed process's state can be determined by accessing only the memory of non-faulty processes. In the literature, diskless checkpointing is usually based on synchronous protocols or on properties of the application. In this paper we present a quasi-synchronous diskless checkpointing algorithm, called RDT-Diskless, based on Rollback-Dependency Trackability. The proposed algorithm includes a garbage collection approach that limits the number of checkpoints that must be kept in memory. A framework, called Cheops, was developed, and experimental results were obtained from a commercial cloud environment. |
Author | Garcia, I C; Menderico, R M |
Author_xml | 1. Menderico, R M (rmm@ic.unicamp.br), Inst. of Comput., State Univ. of Campinas (UNICAMP), Campinas, Brazil; 2. Garcia, I C (islene@ic.unicamp.br), Inst. of Comput., State Univ. of Campinas (UNICAMP), Campinas, Brazil |
ContentType | Conference Proceeding |
Discipline | Computer Science |
EndPage | 281 |
ExternalDocumentID | 5623400 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | true |
PageCount | 7 |
PublicationCentury | 2000 |
PublicationDate | 2010-Oct. |
PublicationDateYYYYMMDD | 2010-10-01 |
PublicationDate_xml | – month: 10 year: 2010 text: 2010-Oct. |
PublicationDecade | 2010 |
PublicationTitle | Proceedings - Symposium on Reliable Distributed Systems |
PublicationTitleAbbrev | SRDS |
PublicationYear | 2010 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 275 |
SubjectTerms | availability; Checkpointing; Clouds; dependability; distributed algorithms; Fault tolerance; Fault tolerant systems; Protocols; Servers; Synchronization |
Title | Diskless Checkpointing with Rollback-Dependency Trackability |
URI | https://ieeexplore.ieee.org/document/5623400 |