Delay-aware resource-efficient interleaved task scheduling strategy in Spark

Bibliographic Details
Published in Computer Science and Information Systems Vol. 22; no. 3; pp. 839-858
Main Authors Zhang, Yanhao; Wang, Congyang; He, Xin; Yu, Junyang; Zhai, Rui; Song, Yalin
Format Journal Article
Language English
Published 01.06.2025
ISSN 1820-0214, 2406-1018
DOI 10.2298/CSIS240831018Z

Summary: To address low CPU and network resource utilization in the task scheduling process of the Spark and Flink computing frameworks, this paper proposes a Delay-Aware Resource-Efficient Interleaved Task Scheduling Strategy (DRTS). The algorithm schedules parallel tasks in a pipelined fashion, improving system resource utilization and shortening job completion times. First, based on historical task completion times, it staggers the execution of tasks within the stage that has the longest completion time, which improves the utilization of system resources and keeps the entire pipelined job running smoothly. Second, task execution is divided into a CPU-intensive phase and a non-CPU-intensive phase, the latter comprising network I/O and disk I/O operations. During the non-CPU-intensive phase, in which tasks fetch data, parallel tasks are launched at suitable intervals to mitigate resource contention and minimize job completion time. Finally, DRTS was implemented on Spark 2.4.0 and evaluated experimentally. The results show that, compared with DelayStage, DRTS reduces job execution time by 3.18% to 6.48% and improves the cluster's CPU and network utilization by 6.33% and 7.02%, respectively.
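
The interleaving idea in the summary can be illustrated with a toy model. The sketch below is not the DRTS algorithm from the paper, and it uses no Spark API. It assumes a single shared network link and a single CPU slot, and it assumes each task's fetch and compute durations are already known (for example, from historical runs). The names Task, staggered_schedule, makespan and serialized_makespan are hypothetical, and the launch interval is simply taken to be the previous task's fetch time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    fetch: float    # non-CPU-intensive phase (network/disk I/O), in seconds
    compute: float  # CPU-intensive phase, in seconds


# (task name, fetch start, compute start, compute end)
Slot = Tuple[str, float, float, float]


def staggered_schedule(tasks: List[Task]) -> List[Slot]:
    """Launch each fetch when the previous fetch ends (assumed interval),
    so one task fetches data while an earlier task is computing."""
    schedule: List[Slot] = []
    net_free = 0.0  # time at which the shared link becomes idle
    cpu_free = 0.0  # time at which the CPU slot becomes idle
    for t in tasks:
        fetch_start = net_free
        fetch_end = fetch_start + t.fetch
        # Compute needs both the fetched data and a free CPU slot.
        compute_start = max(fetch_end, cpu_free)
        compute_end = compute_start + t.compute
        schedule.append((t.name, fetch_start, compute_start, compute_end))
        net_free = fetch_end
        cpu_free = compute_end
    return schedule


def makespan(schedule: List[Slot]) -> float:
    """Completion time of the last task in the schedule."""
    return max((end for _, _, _, end in schedule), default=0.0)


def serialized_makespan(tasks: List[Task]) -> float:
    """Baseline with no overlap: all fetch work, then all compute work."""
    return sum(t.fetch for t in tasks) + sum(t.compute for t in tasks)


# Three identical hypothetical tasks: 2 s of fetch, 3 s of compute each.
tasks = [Task(f"t{i}", fetch=2.0, compute=3.0) for i in range(3)]
plan = staggered_schedule(tasks)
print(makespan(plan), serialized_makespan(tasks))  # 11.0 15.0
```

In this toy setting the staggered plan finishes in 11 s against 15 s for the non-overlapping baseline, because fetches for later tasks are hidden behind the compute phases of earlier ones. The figures come from the assumed durations only and say nothing about the improvements reported in the paper.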