Design and implementation of GXP make — A workflow system based on make

Bibliographic Details
Published in: Future Generation Computer Systems, Vol. 29, no. 2, pp. 662-672
Main Authors: Taura, Kenjiro; Matsuzaki, Takuya; Miwa, Makoto; Kamoshida, Yoshikazu; Yokoyama, Daisaku; Dun, Nan; Shibata, Takeshi; Jun, Choi Sung; Tsujii, Jun’ichi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.02.2013
Summary: This paper describes a rationale for designing workflow systems based on the Unix make by showing a number of idioms useful for workflows comprising many tasks. It also demonstrates a specific design and implementation of such a workflow system, called GXP make. GXP make supports all the features of GNU make and extends its platforms from single-node systems to clusters, clouds, supercomputers, and distributed systems. Interestingly, this is achieved with a very small code base that does not modify the GNU make implementation at all. Although the design is not ideal for performance, it achieves useful performance and scalability, dispatching one million tasks in approximately 5000 s (200 tasks per second, including dependence analysis) on an 8-core Intel Nehalem node. As real applications, recognition and classification of protein–protein interactions from biomedical texts on a supercomputer with more than 8000 cores are described.
► With GXP make, you can run your makefiles in parallel in clusters and distributed environments.
► It attains a throughput of 200 tasks/s while guaranteeing near-perfect compatibility with GNU make.
► Many makefile features useful for describing workflows of many tasks are uncovered.
► GXP make facilitates a smooth transition from single-node (multicore) environments to clusters and supercomputers.
ISSN: 0167-739X, 1872-7115
DOI: 10.1016/j.future.2011.05.026
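
The summary describes workflows as make-style rules: each task names its prerequisites and is dispatched once they are complete. The following is a minimal Python sketch of that general idea only. It is not GXP make's implementation, which the record does not detail; the names `run_workflow`, `rules`, and `actions` are hypothetical, and a local thread pool stands in for dispatch to cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def run_workflow(rules, actions, max_workers=4):
    """Run tasks in dependency order, in the spirit of make (illustrative only).

    rules:   dict mapping each target to a list of its prerequisite targets
    actions: dict mapping each target to a zero-argument callable
    Returns the targets in the order in which they finished.
    """
    # Copy prerequisites into sets so completed ones can be crossed off.
    remaining = {target: set(deps) for target, deps in rules.items()}
    finished = []
    running = {}  # future -> target currently executing

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Dispatch every target whose prerequisites are all complete.
            ready = [t for t, deps in remaining.items() if not deps]
            for target in ready:
                del remaining[target]
                running[pool.submit(actions[target])] = target

            if not running:
                # Nothing is runnable and nothing is in flight.
                raise ValueError("dependency cycle or undefined prerequisite")

            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                target = running.pop(future)
                future.result()  # re-raise any exception from the task
                finished.append(target)
                for deps in remaining.values():
                    deps.discard(target)
    return finished


if __name__ == "__main__":
    # Hypothetical three-target workflow: "all" depends on two independent tasks.
    rules = {"a.out": [], "b.out": [], "all": ["a.out", "b.out"]}
    actions = {name: (lambda n=name: print("building", n)) for name in rules}
    print(run_workflow(rules, actions))
```

In this sketch the two independent targets may finish in either order, but `all` always runs last, which is the ordering guarantee a dependency-driven dispatcher has to provide.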