Applied forecasting for delayed cerebral ischemia prediction post subarachnoid hemorrhage: Methodological fallacies

Bibliographic Details
Published in Informatics in medicine unlocked Vol. 28; p. 100817
Main Authors Alexopoulos, Georgios; Zhang, Justin; Karampelas, Ioannis; Khan, Maheen; Quadri, Nabiha; Patel, Mayur; Patel, Niel; Almajali, Mohammad; Mattei, Tobias A.; Kemp, Joanna; Coppens, Jeroen; Mercier, Philippe
Format Journal Article
Language English
Published Elsevier Ltd 2022
Elsevier
Summary: Delayed Cerebral Ischemia (DCI) is an important cause of morbidity and mortality after aneurysmal Subarachnoid Hemorrhage (aSAH). Researchers have utilized various methods for predicting patients at risk for DCI progression. We conducted an eight-year retrospective review of aSAH patients who presented to St Louis University Hospital. The records were screened for demographic, clinical, and radiographic parameters. DCI was the primary outcome. We identified 16 features to fit various forecasting models and selected the best binary classifier through comprehensive machine learning (ML) workflows. Regression and ensemble tree-based algorithms were utilized, based on their performance on tabular data. We investigated whether a single model could outperform the others in our dataset. Because class imbalance in the outcome (DCI) was expected, we selected precision, recall, and F-score as threshold metrics. Precision-recall curves were used for model performance ranking. Of the 213 aSAH patients analyzed, 42 progressed to DCI (19.7%). The mean age was 55.7 years. The outcome variable (DCI) was imbalanced with a class ratio of 1:4. Bivariate analysis revealed two significant associations: the “Hunt-and-Hess scale” (p-value = 0.016) and “Posthemorrhagic hydrocephalus” (p-value < 0.001). The all-relevant features identified during feature selection were “Fisher scale,” “Modified Fisher scale,” “Hunt-and-Hess scale,” and “Posthemorrhagic hydrocephalus”. “Treatment type” was tentative. The random forests model achieved a pooled accuracy of 71.1% (95%CI: 60.4, 83.4) with an F1-score of 0.484. The best binary classifier utilized extreme gradient boosting and was trained on the all-relevant predictors plus “Aneurysm type.” Extreme gradient boosting achieved a predictive accuracy of 84.3% (95%CI: 75.9, 93.4) with an F1-score of 0.684.
We describe the challenges that arise when training a binary classifier on imbalanced datasets. Through an extensive comparison with similar published studies, we not only demonstrate the model's performance but also identify multiple methodological fallacies in forecasting within neurological research. By combining baseline patient characteristics with radiographic grading scales, we built a simple yet robust, highly accurate and, most importantly, useful binary classifier for DCI prediction. The model is available online and can be utilized clinically as an effective forecasting tool (https://georgiosalexopoulos.shinyapps.io/download/).
• We propose the best binary classifier for early prediction of delayed cerebral ischemia in patients with ruptured intracranial aneurysms.
• A comprehensive machine learning workflow for delayed cerebral ischemia forecasting.
• Highlighting the challenges of binary classifier training on imbalanced data and identifying methodological AI fallacies in neurological research.
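The summary's reason for preferring precision, recall, and F-score over accuracy on an imbalanced outcome can be illustrated with a short calculation. The sketch below is not the authors' code; it takes only the cohort counts stated in the summary (213 patients, 42 with DCI) and scores a hypothetical "always predict no DCI" baseline, whose confusion counts are an assumption made purely for illustration. The helper name `binary_metrics` is likewise hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Undefined ratios (zero denominators) are reported as 0.0 by convention.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Cohort counts taken from the summary.
N_PATIENTS = 213
N_DCI = 42

# Prevalence of the positive class (DCI).
prevalence = N_DCI / N_PATIENTS

# Hypothetical baseline (assumption, not a model from the paper):
# label every patient as "no DCI", so there are no positive predictions.
baseline = binary_metrics(tp=0, fp=0, fn=N_DCI, tn=N_PATIENTS - N_DCI)

print(f"DCI prevalence:    {prevalence:.1%}")
print(f"Baseline accuracy: {baseline['accuracy']:.1%}")
print(f"Baseline recall:   {baseline['recall']:.3f}")
print(f"Baseline F1-score: {baseline['f1']:.3f}")
```

Under these assumptions the baseline reaches roughly 80% accuracy while identifying no DCI patient at all, so its recall and F1-score are zero. This is why accuracy alone says little about a classifier at a class ratio near 1:4.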
ISSN: 2352-9148
DOI: 10.1016/j.imu.2021.100817