QHB+: Accelerated Configuration Optimization for Automated Performance Tuning of Spark SQL Applications

Bibliographic Details
Published in: IEEE Access, Vol. 12, pp. 60138-60148
Main Authors: Jang, Deokyeon; Yoon, Hyunsik; Jung, Kijung; Chung, Yon Dohn
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Apache Spark is a well-known solution for big data processing because of its efficiency and rapid processing capabilities. One of its modules, Spark SQL, serves as a prominent big data query engine. However, executing Spark SQL applications on massive data can be time-intensive, and the execution time can vary significantly depending on the application's configuration. Recent studies try to reduce application execution times by searching for optimal configurations. While Bayesian optimization is recognized in these studies as a powerful method for configuration optimization, it suffers from high computational cost and long computation times, especially when dealing with large search spaces. To address these challenges, we propose QHB+, designed to rapidly search for optimal configurations. QHB+ applies Successive Halving Algorithm-based optimization methods, which perform well in hyperparameter optimization of machine learning models, to configuration optimization of Spark SQL applications. Through empirical evaluations against established benchmarks, we show the efficiency of QHB+, highlighting it as a swift alternative to conventional optimization methods for optimizing Spark SQL configurations.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3391333
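
The summary names the Successive Halving Algorithm as the basis of QHB+ but does not describe the method in detail. The following is a minimal sketch of generic successive halving only, not the authors' QHB+ implementation. It assumes that a "budget" is some cheap proxy for a full run (the summary does not say what QHB+ uses), and `toy_runtime` is a hypothetical stand-in cost function rather than a real Spark measurement. The key `spark.sql.shuffle.partitions` is a real Spark SQL setting, used here purely as a label.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Generic successive halving (illustrative, not QHB+ itself).

    Evaluate every candidate on a small budget, keep the best 1/eta of them,
    multiply the budget by eta, and repeat until one candidate remains.
    Lower scores from `evaluate` are better (e.g. execution time).
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the current candidates by measured cost at this budget.
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        # Keep the top 1/eta fraction, but always at least one candidate.
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        # Survivors earn a larger budget in the next round.
        budget *= eta
    return survivors[0]


def toy_runtime(config, budget):
    """Hypothetical cost model: pretends 200 shuffle partitions is ideal.

    A real tuner would run (part of) the Spark SQL workload here instead.
    """
    partitions = config["spark.sql.shuffle.partitions"]
    return budget * (abs(partitions - 200) + 1)


# Assumed candidate values, chosen only for illustration.
candidates = [
    {"spark.sql.shuffle.partitions": p}
    for p in (50, 100, 200, 400, 800, 1600, 25, 300)
]

best = successive_halving(candidates, toy_runtime)
print(best)
```

Because weak candidates are discarded after cheap evaluations, most of the total budget is spent on the few configurations that still look promising, which is the property that makes this family of methods fast on large search spaces.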