Prediction of stock price direction using a hybrid GA-XGBoost algorithm with a three-stage feature engineering process

Bibliographic Details
Published in	Expert systems with applications Vol. 186; p. 115716
Main Authors	Yun, Kyung Keun; Yoon, Sang Won; Won, Daehan
Format	Journal Article
Language	English
Published	New York: Elsevier Ltd (Elsevier BV), 30.12.2021
Subjects	Algorithms; Artificial intelligence; Big Data; Blessing of dimensionality; Curse of dimensionality; Deep learning; Engineering; Feature set expansion; Genetic algorithm; Hybrid systems; Indicators; Machine learning; Optimal feature set; Performance prediction; Pricing; Securities markets; Technical indicators; XGBoost feature selection
Summary:
• Good performance with less computational cost than deep learning models.
• Better interpretability than deep learning models.
• Feature set expansion to gain the blessing of dimensionality.
• Optimal feature set to relieve the curse of dimensionality.

For several centuries, the stock market has performed one of the most important functions in a laissez-faire economic system by bringing together people, companies, and flows of money. Researchers have conducted numerous studies to predict stock prices, and with the advent of big data and the rapid development of artificial intelligence, a growing number have applied machine learning or deep learning techniques to stock market prediction. However, accurately predicting stock price direction remains difficult because stock prices are inherently complex, nonlinear, nonstationary, and sometimes too irrational to be predictable. Despite the wealth of available information, previous prediction systems often overlooked key indicators and the importance of feature engineering. This study proposes a hybrid GA-XGBoost prediction system with an enhanced feature engineering process consisting of three stages: feature set expansion, data preparation, and optimal feature set selection using the hybrid GA-XGBoost algorithm. The study experimentally verifies the importance of the feature engineering process in stock price direction prediction by comparing the obtained feature sets with the original dataset and by improving prediction performance beyond that of benchmark models. Specifically, the most significant accuracy increment comes from feature expansion, which adds 67 technical indicators to the original historical stock price data. The study also produces a parsimonious optimal feature set using the GA-XGBoost algorithm that can achieve the desired performance with substantially fewer features. Consequently, this study empirically demonstrates that successful prediction performance largely depends on a deliberate combination of feature engineering processes with a baseline learning model that strikes a balance between the curse of dimensionality and the blessing of dimensionality.
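
The optimal feature set selection stage described in the summary can be illustrated with a minimal sketch of genetic-algorithm feature selection. This is not the authors' implementation: the function name `ga_select_features`, all parameter defaults, the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and the toy fitness function are assumptions made for illustration. In a setting like the one the summary describes, the fitness of a feature mask would presumably be the validation accuracy of an XGBoost model trained on the selected columns; here a stand-in score is used so that the example runs with the Python standard library alone.

```python
import random
from typing import Callable, List, Tuple

Mask = Tuple[int, ...]  # one bit per feature: 1 = keep, 0 = drop


def ga_select_features(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    generations: int = 40,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.02,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search for a high-fitness feature mask (hypothetical helper)."""
    rng = random.Random(seed)

    # Start from the full feature set plus random masks, so the result
    # is never worse than simply using every feature.
    population: List[Mask] = [tuple([1] * n_features)]
    while len(population) < pop_size:
        population.append(tuple(rng.randint(0, 1) for _ in range(n_features)))

    def tournament(scored: List[Tuple[float, Mask]]) -> Mask:
        # Binary tournament: draw two candidates, keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    scored = [(fitness(m), m) for m in population]
    best_score, best_mask = max(scored, key=lambda pair: pair[0])

    for _ in range(generations):
        next_population = [best_mask]  # elitism: carry the best mask over
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            child = p1
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit
                for bit in child
            )
            next_population.append(child)
        population = next_population

        scored = [(fitness(m), m) for m in population]
        top_score, top_mask = max(scored, key=lambda pair: pair[0])
        if top_score > best_score:
            best_score, best_mask = top_score, top_mask

    return best_mask, best_score


# Toy fitness (an assumption, not from the paper): reward three
# "informative" features and charge a small cost per selected feature,
# standing in for validation accuracy with a parsimony preference.
INFORMATIVE = (0, 3, 7)


def toy_fitness(mask: Mask) -> float:
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = ga_select_features(12, toy_fitness)
```

The per-feature cost in the toy fitness mirrors the trade-off the summary names: adding features can help (blessing of dimensionality), but each extra feature must earn its place (curse of dimensionality). Replacing `toy_fitness` with a model-based score leaves the search loop unchanged.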
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2021.115716