Symbolic Regression with augmented dataset using RuleFit

Bibliographic Details
Published in: 2022 24th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 323-326
Main Author: Olivetti de Franca, Fabricio
Format: Conference Proceeding
Language: English
Published: IEEE, 01.09.2022
Summary: Symbolic Regression models are often associated with transparency and interpretability, mainly because they can describe nonlinear relationships while balancing accuracy and conciseness. In practice, however, Symbolic Regression may generate models that are as hard to understand as opaque models. Linear models, by contrast, are guaranteed to be transparent but fail to capture nonlinearities and interactions. The RuleFit algorithm uses a tree-based nonlinear model to create meta-features that augment the dataset, increasing the accuracy of linear models while maintaining their transparency. In this paper we test whether this augmented dataset can help Symbolic Regression find more transparent models without reducing overall accuracy. The results indicate that the augmented models have slightly better accuracy on a class of benchmarks while keeping the expression size small and closer to a linear model. As a caveat, the models also tend to become closer to a step function, which limits the interpretability of the studied phenomena.
ISSN: 2470-881X
DOI: 10.1109/SYNASC57785.2022.00058
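
The augmentation step mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's code: it assumes scikit-learn, uses only leaf-membership indicators from a gradient-boosted ensemble as the meta-features (RuleFit itself also derives rules from internal tree nodes), and the helper name `augment_with_leaf_features`, the synthetic data, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import OneHotEncoder


def augment_with_leaf_features(X, y, n_trees=20, max_depth=2, seed=0):
    """Append binary leaf-membership columns to X (hypothetical helper).

    Simplification: only leaves are used as rules, not internal nodes.
    """
    gbr = GradientBoostingRegressor(
        n_estimators=n_trees, max_depth=max_depth, random_state=seed
    )
    gbr.fit(X, y)
    # apply() gives, for each sample, the index of the leaf reached in each tree.
    leaves = gbr.apply(X)  # shape: (n_samples, n_trees)
    # One-hot encode leaf indices: one binary column per (tree, leaf) pair.
    rules = OneHotEncoder(handle_unknown="ignore").fit_transform(leaves).toarray()
    return np.hstack([X, rules])


# Synthetic data (an assumption): a threshold effect plus an interaction term.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.where(X[:, 0] > 0.2, 2.0, 0.0) + X[:, 1] * X[:, 2]

X_aug = augment_with_leaf_features(X, y)

# L1-regularised linear fits on the original and augmented features.
# alpha is an arbitrary illustrative value, not a tuned one.
plain = Lasso(alpha=0.01, max_iter=10_000).fit(X, y)
augmented = Lasso(alpha=0.01, max_iter=10_000).fit(X_aug, y)

print("columns before/after augmentation:", X.shape[1], X_aug.shape[1])
print("training R^2, original features:  ", plain.score(X, y))
print("training R^2, augmented features: ", augmented.score(X_aug, y))
```

In the setting the summary describes, a matrix like `X_aug` would be passed to a Symbolic Regression tool instead of a linear model. Because every added column is a 0/1 indicator, expressions built from them are piecewise constant in those terms, which is consistent with the step-function caveat noted in the summary.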