Randomized Algorithms for Scheduling Multi-Resource Jobs in the Cloud


Bibliographic Details
Published in: IEEE/ACM Transactions on Networking, Vol. 26, No. 5, pp. 2202-2215
Main Authors: Psychas, Konstantinos; Ghaderi, Javad
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2018

Summary: We consider the problem of scheduling jobs with multiple-resource requirements (CPU, memory, and disk) in a distributed server platform, motivated by data-parallel and cloud computing applications. Jobs arrive dynamically over time and require certain amounts of multiple resources for the duration of their service. When a job arrives, it is queued and later served by one of the servers that has sufficient remaining resources to serve it. The scheduling of jobs is subject to two constraints: 1) packing constraints: multiple jobs can be served simultaneously by a single server if their cumulative resource requirement does not exceed the capacity of the server, and 2) non-preemption: to avoid costly preemptions, once a job is scheduled in a server, its service cannot be interrupted or migrated to another server. Prior scheduling algorithms rely either on bin-packing heuristics, which have low complexity but can have poor throughput, or on MaxWeight solutions, which can achieve maximum throughput but repeatedly require solving or approximating instances of a hard combinatorial problem (Knapsack) over time. In this paper, we propose a randomized scheduling algorithm for placing jobs in servers that can achieve maximum throughput with low complexity. The algorithm is naturally distributed, and each queue and each server needs to perform only a constant number of operations per time unit. Extensive simulation results, using both synthetic and real traffic traces, are presented to evaluate the throughput and delay performance compared to prior algorithms.
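
The following is a minimal illustrative sketch in Python of the two constraints described in the summary (packing and non-preemption) together with a naive randomized placement rule. It is a toy model under assumed names and data structures, not the algorithm proposed in the paper; the Job, Server, and place_randomly identifiers are hypothetical.

    import random
    from dataclasses import dataclass, field

    @dataclass
    class Job:
        # Multi-resource demand of a single job (hypothetical units).
        cpu: float
        mem: float
        disk: float

    @dataclass
    class Server:
        # Per-resource capacity of a server and the jobs it currently runs.
        cpu: float
        mem: float
        disk: float
        running: list = field(default_factory=list)

        def fits(self, job: Job) -> bool:
            # Packing constraint: cumulative demand of the running jobs plus
            # the new job must not exceed capacity in any resource dimension.
            used_cpu = sum(j.cpu for j in self.running)
            used_mem = sum(j.mem for j in self.running)
            used_disk = sum(j.disk for j in self.running)
            return (used_cpu + job.cpu <= self.cpu and
                    used_mem + job.mem <= self.mem and
                    used_disk + job.disk <= self.disk)

    def place_randomly(job: Job, servers: list) -> Server | None:
        # Naive randomized placement: sample one server and schedule the job
        # there only if it fits. Once placed, the job is never interrupted or
        # migrated (non-preemption); otherwise it stays queued and retries.
        server = random.choice(servers)
        if server.fits(job):
            server.running.append(job)
            return server
        return None

This sketch only captures the feasibility check and the non-preemption rule; the paper's contribution lies in the randomized placement policy itself, which is designed to achieve maximum throughput with constant work per queue and per server per time unit.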
ISSN: 1063-6692, 1558-2566
DOI: 10.1109/TNET.2018.2863647