Provenance Services for Distributed Workflows

Bibliographic Details
Published in 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), pp. 526-533
Main Authors da Cruz, S.M.S., Barros, P.M., Bisch, P.M., Campos, M.L.M., Mattoso, M.
Format Conference Proceeding
Language English
Published IEEE 01.05.2008
Summary: Scientific experiments using workflows benefit from mechanisms to trace the generation of results. As workflows scale, it is essential to have access to their underlying processes, parameters, and data. In molecular dynamics (MD) simulations in particular, a study of the interatomic interactions in proteins must use distributed high-performance computing environments to produce timely results. Scientists' trust in experiments produced by gathering distributed partial results may be limited without provenance information. This paper presents a service architecture that captures and stores provenance data from distributed, autonomous, replicated, and heterogeneous resources. Such provenance data can be used to trace the history of the distributed execution process. These services can be coupled to workflow management systems. The Kepler system was used as a basis to manage a grid workflow application. Experimental results from cluster and grid MD simulations were evaluated using the provenance services architecture.
DOI: 10.1109/CCGRID.2008.73