Return prediction by machine learning for the Korean stock market

Bibliographic Details
Published in Journal of the Korean Statistical Society Vol. 53; no. 1; pp. 248-280
Main Authors Choi, Wonwoo, Jang, Seongho, Kim, Sanghee, Park, Chayoung, Park, Sunyoung, Song, Seongjoo
Format Journal Article
Language English
Published Singapore Springer Nature Singapore 01.03.2024
한국통계학회
Summary: In this study, we aim to forecast monthly stock returns and analyze the factors influencing stock prices in the Korean stock market. To find a model that maximizes the cumulative return of a portfolio of stocks with high predicted returns, we use machine learning models including linear models, tree-based models, neural networks, and learning-to-rank algorithms. To tune hyperparameters, we employ a novel validation metric, the Cumulative net Return of a Portfolio with top 10% predicted return (CRP10), which targets the cumulative return of the selected portfolio directly. With the data that we used, CRP10 tends to yield higher cumulative returns than out-of-sample R-squared as a validation metric. Our findings indicate that Light Gradient Boosting Machine (LightGBM) and Gradient Boosted Regression Trees (GBRT) perform better than the other models when a single model is applied over the entire test period. We also test a strategy of switching models on a yearly basis, selecting the best model each year, and observe that it does not outperform using a single model such as LightGBM or GBRT for the entire period.
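The summary does not give a formula for CRP10, so the following is only a minimal sketch of how such a metric might be computed. It assumes an equal-weighted portfolio that is rebuilt every month, returns that compound across months, and a flat per-month cost standing in for the "net" adjustment. The function name `crp10` and the parameters `top_frac` and `cost` are hypothetical and are not taken from the paper.

```python
import numpy as np

def crp10(pred, realized, months, top_frac=0.10, cost=0.0):
    """Cumulative net return of a monthly top-`top_frac` portfolio.

    Assumptions (not from the paper): equal weighting, monthly
    rebalancing, and a flat cost subtracted from each month's return.
    """
    pred = np.asarray(pred, dtype=float)
    realized = np.asarray(realized, dtype=float)
    months = np.asarray(months)

    wealth = 1.0
    for m in np.unique(months):
        idx = np.flatnonzero(months == m)           # stocks observed in month m
        k = max(1, int(round(top_frac * idx.size)))  # portfolio size, at least one stock
        order = np.argsort(pred[idx])[::-1]          # highest predicted return first
        top = idx[order[:k]]
        wealth *= 1.0 + realized[top].mean() - cost  # compound the net monthly return
    return wealth - 1.0
```

Under these assumptions, hyperparameter tuning would pick the configuration with the largest `crp10` on the validation months, whereas tuning on out-of-sample R-squared rewards accuracy over the whole cross-section rather than only the stocks that would be held.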
ISSN: 1226-3192
2005-2863
DOI: 10.1007/s42952-023-00245-0