Adaptive checkpointing in dynamic grids for uncertain job durations
Published in | Information technology interfaces pp. 585 - 590 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | Zagreb, IEEE, 01.06.2009, University Computing Centre |
Summary: | Adaptive checkpointing is a relatively new approach that is particularly suitable for providing fault tolerance in dynamic and unstable grid environments. The approach allows checkpointing intervals to be modified periodically at run-time, as additional information becomes available. This paper introduces an adaptive algorithm, named MeanFailureCP+, for checkpointing grid applications whose execution times are unknown a priori. The algorithm modifies its parameters based on dynamically collected feedback on its performance. Simulation results show that the new algorithm performs even better than adaptive approaches that make use of exact information on job execution times. |
---|---|
ISBN: | 9789537138158 9537138151 |
ISSN: | 1330-1012 |
DOI: | 10.1109/ITI.2009.5196152 |