Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks

Bibliographic Details
Published in	IEEE Access, Vol. 8, pp. 52588-52608
Main Authors	Cho, Hyunghun; Kim, Yongjin; Lee, Eunjung; Choi, Daeyoung; Lee, Yongjae; Rhee, Wonjong
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020
Subjects	Algorithms; Artificial neural networks; Bayes methods; Bayesian analysis; Bayesian optimization; Benchmark testing; Benchmarks; Cost analysis; Cost function; cost function transformation; Deep learning; Deep neural networks; diversification; early termination; hyperparameter optimization; Learning curves; Machine learning; Neural networks; Optimization; Parallel processing; parallelization; Robustness (mathematics); Task analysis; Training; Tuning
Summary:	Compared to traditional machine learning models, deep neural networks (DNNs) are known to be highly sensitive to the choice of hyperparameters. While the time and effort required for manual tuning have been decreasing rapidly for well-developed and commonly used DNN architectures, DNN hyperparameter optimization will undoubtedly remain a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed, and some of the most popular are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization: diversification, early termination, parallelization, and cost function transformation. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization: DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and the speed-up variants that use Median Stopping Rule or Learning Curve Extrapolation. DEEP-BO consistently delivered the top, or close to the top, performance across all the benchmark types tested, which indicates that it is more robust than the existing solutions. The DEEP-BO code is publicly available at https://github.com/snu-adsl/DEEP-BO.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2020.2981072
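
Illustrative note: two of the ideas named in the summary, early termination and diversification, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the DEEP-BO implementation: the function names `should_stop` and `round_robin` are hypothetical, the stopping check is a simplified variant of a median-style rule on accuracy curves (higher is better), and diversification is shown only as a plain rotation among named strategies, since the record does not say how DEEP-BO realizes either strategy.

```python
from itertools import cycle, islice
from statistics import mean, median


def should_stop(curve, finished_curves):
    """Simplified median-style stopping check (an assumption, not DEEP-BO's rule).

    `curve` holds the accuracies of the current run so far; `finished_curves`
    holds the accuracy curves of earlier runs. The run is flagged for early
    termination when its best accuracy so far is below the median of the
    earlier runs' running averages up to the same step.
    """
    step = len(curve)
    # Only compare against earlier runs that reached at least this step.
    averages = [mean(c[:step]) for c in finished_curves if len(c) >= step]
    if not averages:
        return False  # nothing to compare against yet
    return max(curve) < median(averages)


def round_robin(strategies):
    """Rotate endlessly over (name, strategy) pairs in insertion order."""
    return cycle(strategies.items())


# Hypothetical accuracy curves from three earlier training runs.
finished = [[0.50, 0.70, 0.80], [0.40, 0.60, 0.75], [0.30, 0.50, 0.60]]
print(should_stop([0.20, 0.30], finished))  # True: clearly lagging run
print(should_stop([0.60, 0.80], finished))  # False: promising run

# Hypothetical strategy names; the values would be suggestion functions.
rotation = round_robin({"strategy_a": None, "strategy_b": None})
print([name for name, _ in islice(rotation, 3)])
# ['strategy_a', 'strategy_b', 'strategy_a']
```

Rotating among strategies spreads the search over several selection behaviours instead of committing to one, and the stopping check frees compute from runs that are unlikely to catch up; both are shown here only in their simplest form.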