Multi-objective Symbolic Regression to Generate Data-driven, Non-fixed Structure and Intelligible Mortality Predictors using EHR: Binary Classification Methodology and Comparison with State-of-the-art

Bibliographic Details
Published in: AMIA ... Annual Symposium proceedings, Vol. 2022, pp. 442-451
Main Authors: Ferrari, Davide; Guidetti, Veronica; Wang, Yanzhong; Curcin, Vasa
Format: Journal Article
Language: English
Published: American Medical Informatics Association, 29.04.2023
Summary: Symbolic Regression (SR) is a data-driven methodology based on Genetic Programming, and it is widely used to produce arithmetic expressions for modelling learning tasks. Compared to other popular statistical techniques, SR outcomes are given by an arbitrary set of mathematical operations, representing arbitrarily complex linear and non-linear functions without a predefined fixed structure. Another advantage is that, unlike other machine learning algorithms, SR produces interpretable results. In this paper, we explore the qualities and limitations of this technique in a novel implementation as a binary classifier for in-hospital or short-term mortality prediction in patients with Covid-19. Our results highlight that SR provides a competitive alternative to popular statistical and machine learning methodologies to model relevant clinical phenomena thanks to good classification performance, stability in unbalanced dataset management, and intrinsic interpretability.
ISSN:1559-4076