An empirical analysis of scheduling techniques for real-time cloud-based data processing

Bibliographic Details
Published in 2011 IEEE International Conference on Service-Oriented Computing and Applications (SOCA), pp. 1-8
Main Authors Phan, L. T. X., Zhuoyao Zhang, Qi Zheng, Boon Thau Loo, Insup Lee
Format Conference Proceeding
Language English
Published IEEE 01.12.2011
Summary: In this paper, we explore the challenges and needs of current cloud infrastructures to better support cloud-based data-intensive applications that are not only latency-sensitive but also require strong timing guarantees. These applications have strict deadlines (e.g., to perform time-dependent mission-critical tasks or to complete real-time control decisions using a human-in-the-loop), and deadline misses are undesirable. To highlight the challenges in this space, we provide a case study of the online scheduling of MapReduce jobs executed by Hadoop. Our evaluations on Amazon EC2 show that the existing Hadoop scheduler is ill-equipped to handle jobs with deadlines. However, by adapting existing multiprocessor scheduling techniques for the cloud environment, we observe significant performance improvements in minimizing missed deadlines and tardiness. Based on our case study, we discuss a range of challenges in this domain posed by virtualization and scale, and propose our research agenda centered around the application of advanced real-time scheduling techniques in the cloud environment.
ISBN: 9781467303187, 1467303186
ISSN: 2163-2871, 2689-7121
DOI: 10.1109/SOCA.2011.6166240