Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features
Since most machine learning (ML) algorithms are designed for numerical inputs, efficiently encoding categorical variables is a crucial aspect in data analysis. A common problem are high cardinality features, i.e. unordered categorical predictor variables with a high number of levels. We study techni...
Saved in:
Published in | Computational statistics Vol. 37; no. 5; pp. 2671 - 2692 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
Berlin/Heidelberg
Springer Berlin Heidelberg
01.11.2022
Springer Nature B.V |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Be the first to leave a comment!