A Hadoop configuration optimization method based on middle platform business operation requirements
Published in | 2023 IEEE International Conference on Sensors, Electronics and Computer Engineering (ICSECE) pp. 1211 - 1216 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 18.08.2023 |
Summary: | The middle platform business is key infrastructure for the digital transformation of the power grid. The big data analysis and processing framework driven by the middle platform business has a large number of configuration parameters with complex meanings and interrelated effects, which makes it difficult to tune quickly and accurately and so to improve middle platform business processing performance. This paper proposes a Hadoop configuration parameter tuning method. First, the K-means clustering algorithm is used to classify the resource utilization characteristics of middle platform business operations. Then, a performance model of these operations is constructed with the random forest algorithm. Finally, the parameter configuration space is searched for the optimal configuration. Experimental results show that the random forest performance model can effectively predict application running time, and that optimizing parameters based on this model improves execution efficiency. |
---|---|
DOI: | 10.1109/ICSECE58870.2023.10263541 |
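
The three stages named in the summary (K-means clustering of resource usage, a random forest runtime model, and a search over the configuration space) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes two illustrative Hadoop parameters (`mapreduce.task.io.sort.mb` and `mapreduce.job.reduces`), a made-up runtime function that stands in for real job measurements, synthetic job profiles, a small pure-Python forest, and an exhaustive grid search as the search step; the record does not say which parameters, data, or search strategy the authors used.

```python
import itertools
import random
import statistics


def kmeans(points, k, iters=10):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(statistics.fmean(dim) for dim in zip(*members))
    return labels


def build_tree(rows, targets, rng, depth=0, max_depth=4, min_size=4):
    """Regression tree; each split considers a random subset of the features."""
    if depth >= max_depth or len(rows) < min_size:
        return statistics.fmean(targets)
    n_features = len(rows[0])
    best = None
    for f in rng.sample(range(n_features), max(1, n_features // 2)):
        # Skipping the smallest value keeps both sides of the split non-empty.
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < threshold]
            right = [t for r, t in zip(rows, targets) if r[f] >= threshold]
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold)
    if best is None:  # the sampled feature is constant in this node
        return statistics.fmean(targets)
    _, f, threshold = best
    left = [(r, t) for r, t in zip(rows, targets) if r[f] < threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] >= threshold]
    return (f, threshold,
            build_tree([r for r, _ in left], [t for _, t in left],
                       rng, depth + 1, max_depth, min_size),
            build_tree([r for r, _ in right], [t for _, t in right],
                       rng, depth + 1, max_depth, min_size))


def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are floats, inner nodes are tuples."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the observations."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Forest prediction is the mean of the individual tree predictions."""
    return statistics.fmean(predict_tree(tree, row) for tree in forest)


rng = random.Random(42)

# Stage 1: cluster jobs by a synthetic (cpu_util, io_util) profile.
profiles = ([(rng.gauss(0.8, 0.05), rng.gauss(0.2, 0.05)) for _ in range(20)]
            + [(rng.gauss(0.2, 0.05), rng.gauss(0.8, 0.05)) for _ in range(20)])
labels = kmeans(profiles, k=2)


# Stage 2: fit a runtime model for one job class. The runtime function below
# is invented for illustration and replaces real measurements.
def fake_runtime(sort_mb, reduces):
    return 100 + (sort_mb - 300) ** 2 / 500 + (reduces - 16) ** 2 + rng.gauss(0, 2)


samples = [(rng.randrange(100, 601, 50), rng.randrange(2, 33, 2)) for _ in range(200)]
runtimes = [fake_runtime(*s) for s in samples]
forest = fit_forest(samples, runtimes)

# Stage 3: exhaustive search of the candidate grid using the model's predictions.
candidates = list(itertools.product(range(100, 601, 50), range(2, 33, 2)))
best = min(candidates, key=lambda c: predict_forest(forest, c))
print("cluster sizes:", labels.count(0), labels.count(1))
print("predicted-best (sort_mb, reduces):", best)
```

In this sketch the model replaces job executions during the search, so each candidate configuration costs one prediction rather than one cluster run. A grid search is only practical for a small parameter space; a larger space would need a sampling-based or heuristic search instead.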