Diskless Checkpointing with Rollback-Dependency Trackability

Bibliographic Details
Published in Proceedings - Symposium on Reliable Distributed Systems, pp. 275-281
Main Authors Menderico, R M; Garcia, I C
Format Conference Proceeding
Language English
Published IEEE 01.10.2010
ISBN 9780769542508
0769542506
ISSN 1060-9857
DOI 10.1109/SRDS.2010.17

Abstract One way to implement fault-tolerant applications is to store the application's current state in stable memory and, when a failure occurs, restart the application from the last consistent global state. If the number of simultaneous failures is expected to be small, a diskless checkpointing approach can be used, in which a failed process's state can be recovered by accessing only the memory of non-faulty processes. In the literature, diskless checkpointing is usually based on synchronous protocols or on properties of the application. In this paper we present a quasi-synchronous diskless checkpointing algorithm, called RDT-Diskless, based on Rollback-Dependency Trackability. The proposed algorithm includes a garbage collection approach that limits the number of checkpoints that must be kept in memory. A framework, called Cheops, was developed, and experimental results were obtained from a commercial cloud environment.
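
The general idea of recovering a failed process's state from the memory of non-faulty processes can be illustrated with a small sketch. This is not the RDT-Diskless algorithm from the paper, whose details are not given in this record; it assumes a simple XOR parity scheme, a common way of doing diskless checkpointing in general. The function names (`xor_bytes`, `make_parity`, `recover`) and the sample states are hypothetical, and all checkpoints are assumed to have the same length.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(checkpoints: list[bytes]) -> bytes:
    """Parity block kept in the memory of a separate (assumed) parity process."""
    return reduce(xor_bytes, checkpoints)


def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing checkpoint from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity)


# Hypothetical in-memory checkpoints of three application processes.
checkpoints = [b"state-A0", b"state-B1", b"state-C2"]
parity = make_parity(checkpoints)

# Assume process 1 fails; its checkpoint is rebuilt without touching a disk.
failed = 1
survivors = [c for i, c in enumerate(checkpoints) if i != failed]
recovered = recover(survivors, parity)
```

With a single parity block, this sketch tolerates only one failure at a time, which matches the abstract's premise that the number of simultaneous failures is expected to be small.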
Author Affiliations
Menderico, R M (rmm@ic.unicamp.br), Inst. of Comput., State Univ. of Campinas (UNICAMP), Campinas, Brazil
Garcia, I C (islene@ic.unicamp.br), Inst. of Comput., State Univ. of Campinas (UNICAMP), Campinas, Brazil
Discipline Computer Science
Subject Terms availability; Checkpointing; Clouds; dependability; distributed algorithms; Fault tolerance; Fault tolerant systems; Protocols; Servers; Synchronization
URI https://ieeexplore.ieee.org/document/5623400