SWARM: Scheduling Large-Scale Jobs over the Loosely-Coupled HPC Clusters

Compute-intensive scientific applications are heavily reliant on the available quantity of computing resources. The Grid paradigm provides a large scale computing environment for scientific users. However, conventional Grid job submission tools do not provide a high-level job scheduling environment...

Full description

Saved in:
Bibliographic Details
Published in2008 IEEE Fourth International Conference on eScience pp. 285 - 292
Main Authors Pallickara, S.L., Pierce, M.
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.12.2008
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Compute-intensive scientific applications are heavily reliant on the available quantity of computing resources. The Grid paradigm provides a large scale computing environment for scientific users. However, conventional Grid job submission tools do not provide a high-level job scheduling environment for these users across multiple institutions. For extremely large number of jobs, a more scalable job scheduling framework that can leverage highly distributed clusters and supercomputers is required. In this paper, we propose a high-level job scheduling Web service framework, Swarm. Swarm is developed for scientific applications that must submit massive number of high-throughput jobs or workflows to highly distributed computing clusters. The Swarm service itself is designed to be extensible, lightweight, and easily installable on a desktop or small server. As a Web service, derivative services based on Swarm can be straightforwardly integrated with Web portals and science gateways. This paper provides the motivation for this research, the architecture of the Swarm framework, and a performance evaluation of the system prototype.
ISBN:9781424433803
1424433800
DOI:10.1109/eScience.2008.64