Towards Automated Design, Analysis and Optimization of Declarative Curation Workflows


Bibliographic Details
Published in International Journal of Digital Curation Vol. 9; no. 2
Main Authors Tianhong Song, Sven Köhler, Bertram Ludäscher, James Hanken, Maureen Kelly, David Lowery, James A. Macklin, Paul J. Morris, Robert A. Morris
Format Journal Article
Language English
Published University of Edinburgh 01.10.2014

Summary: Data curation is increasingly important. Our previous work on a Kepler curation package has demonstrated advantages that come from automating data curation pipelines by using workflow systems. However, manually designed curation workflows can be error-prone and inefficient due to a lack of user understanding of the workflow system, misuse of actors, or human error. Correcting problematic workflows is often very time-consuming. A more proactive workflow system can help users avoid such pitfalls. For example, static analysis before execution can be used to detect the potential problems in a workflow and help the user to improve workflow design. In this paper, we propose a declarative workflow approach that supports semi-automated workflow design, analysis and optimization. We show how the workflow design engine helps users to construct data curation workflows, how the workflow analysis engine detects different design problems of workflows and how workflows can be optimized by exploiting parallelism.
ISSN: 1746-8256