Stargazer: Toward efficient data analytics scheduling via task completion time inference

Bibliographic Details
Published in	Computers & electrical engineering Vol. 92; p. 107092
Main Authors	Du, Haizhou; Zhang, Keke; Xiang, Qiao
Format	Journal Article
Language	English
Published	Amsterdam Elsevier Ltd 01.06.2021 Elsevier BV
Subjects	Completion time; Computation complexity; Data analysis; Data locality; Deep learning; Delay scheduling; Heterogeneity; Resource scheduling; Schedules; Scheduling; Spark scheduling optimization; Task scheduling
Summary:	The fundamental challenge of data analytics scheduling is the heterogeneity of both data analytics jobs and resources. Although many scheduling solutions have been developed to improve the efficiency of data analytics frameworks (e.g., Spark), they either (1) focus on scheduling a single type of resource, without considering the coordination between different resources; or (2) schedule multiple resources using only limited information about analytics jobs, without considering the heterogeneity of resources. This paper presents Stargazer, a novel, efficient system that tackles diverse data analytics jobs on heterogeneous clusters by inferring the completion times of their decomposed tasks. Specifically, Stargazer adopts a deep learning model, which takes into consideration multiple key factors of diverse data analytics jobs and heterogeneous resources, to accurately infer the completion times of different tasks. A prototype of Stargazer is fully implemented in the Spark framework. Extensive experiments show that Stargazer can reduce the average job completion time by 21% and improve average performance by 20%, while incurring little overhead.
ISSN:	0045-7906, 1879-0755
DOI:	10.1016/j.compeleceng.2021.107092