Hybrid Job Scheduling for Improved Cluster Utilization

Bibliographic Details
Published in	Euro-Par 2013: Parallel Processing Workshops, pp. 395-405
Main Authors	Ari, Ismail; Kocak, Ugur
Format	Book Chapter
Language	English
Published	Berlin, Heidelberg: Springer Berlin Heidelberg, 2014
Series	Lecture Notes in Computer Science
Subjects	Cluster Technology; Hadoop Distributed File System; High Performance Computing; High Throughput Computing; Message Passing Interface
ISBN	3642544193 9783642544194
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-642-54420-0_39

Summary:	In this paper, we investigate the models, issues, and performance benefits of hybrid job scheduling over shared physical clusters. Clustering technologies that are currently supported include MPI, Hadoop-MapReduce, and NoSQL systems. Our proposed scheduling model sits above the cluster-specific middleware and OS-level schedulers and is complementary to them. First, we demonstrate that we can effectively schedule MPI, Hadoop, and NoSQL jobs together by profiling them and then co-scheduling them. Second, we find that it is better to co-schedule cluster jobs with different characteristics (CPU-intensive vs. I/O-intensive) than to co-schedule two CPU-intensive jobs. Third, we use this principle to design a greedy sort-merge scheduler. Savings of up to 37% in total job completion time are demonstrated. These savings are directly proportional to the cluster utilization improvements.
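The pairing principle in the summary (co-schedule a CPU-intensive job with an I/O-intensive one) can be illustrated with a small sketch. This is not the authors' greedy sort-merge scheduler, whose details are not given in this record; it is one possible reading, assuming each job has already been profiled into a single hypothetical `cpu_share` score. The `Job` type, the `greedy_pairing` function, and the job names and scores are all invented for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Job(NamedTuple):
    name: str
    # Hypothetical profiling result: fraction of runtime spent CPU-bound (0..1).
    # Low values are treated as I/O-intensive jobs.
    cpu_share: float


def greedy_pairing(jobs: List[Job]) -> List[Tuple[Job, Optional[Job]]]:
    """Sort jobs by CPU share, then pair from both ends of the sorted list.

    Each pair combines the most CPU-intensive remaining job with the most
    I/O-intensive remaining job. With an odd number of jobs, the middle job
    is left to run alone (paired with None).
    """
    ordered = sorted(jobs, key=lambda j: j.cpu_share)
    pairs: List[Tuple[Job, Optional[Job]]] = []
    lo, hi = 0, len(ordered) - 1
    while lo < hi:
        pairs.append((ordered[hi], ordered[lo]))
        lo += 1
        hi -= 1
    if lo == hi:
        pairs.append((ordered[lo], None))
    return pairs


# Illustrative input only; the names and scores are made up.
jobs = [
    Job("mpi_solver", 0.9),
    Job("hadoop_sort", 0.3),
    Job("nosql_load", 0.2),
    Job("mpi_fft", 0.8),
]
pairs = greedy_pairing(jobs)
for cpu_job, io_job in pairs:
    partner = io_job.name if io_job is not None else "(runs alone)"
    print(f"{cpu_job.name} + {partner}")
```

Pairing from opposite ends of the sorted list is a greedy choice: it never places two of the most CPU-intensive jobs on the same nodes while an I/O-intensive partner is still available.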