Evolutionary Strategies Enable Systematic and Reliable Uncertainty Quantification: A Proof-of-Concept Pilot Study on Resting-State Functional MRI Language Lateralization
Published in | Journal of imaging informatics in medicine |
---|---|
Main Authors | , , , , , , , , , , , , , |
Format | Journal Article |
Language | English |
Published | Switzerland, 09.07.2024 |
Subjects | |
Summary: | Reliable and trustworthy artificial intelligence (AI), particularly in high-stakes medical diagnoses, necessitates effective uncertainty quantification (UQ). Existing UQ methods using model ensembles often introduce invalid variability or computational complexity, rendering them impractical and ineffective in clinical workflows. We propose a UQ approach based on deep neuroevolution (DNE), a data-efficient optimization strategy. Our goal is to replicate trends observed in expert-based UQ. We focused on language lateralization maps from resting-state functional MRI (rs-fMRI). Fifty rs-fMRI maps were divided into training and testing sets (30:20), representing two labels: "left-dominant" and "co-dominant." Using DNE, we obtained an ensemble of 100 models with high training and testing set accuracy. Model uncertainty was derived from the entropy of the distribution of the 100 models' predictions. Expert reviewers provided user-based uncertainties for comparison. Model (epistemic) and user-based (aleatoric) uncertainties were consistent in the independently and identically distributed (IID) testing set, mainly indicating low uncertainty. In a mostly out-of-distribution (OOD) holdout set, model and user-based entropies were correlated but displayed a bimodal distribution, with one peak representing low and another high uncertainty. We also found a statistically significant positive correlation between epistemic and aleatoric uncertainties. DNE-based UQ effectively mirrored user-based uncertainties, particularly highlighting increased uncertainty in OOD images. We conclude that DNE-based UQ correlates with expert assessments, making it reliable for our use case and potentially for other radiology applications. |
---|---|
ISSN: | 2948-2933 |
DOI: | 10.1007/s10278-024-01188-6 |
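
The summary states that model uncertainty was derived from entropies over the 100 model predictions. A minimal sketch of one way such a per-image entropy could be computed follows. It assumes each model casts a hard vote for one of the two labels and that Shannon entropy in bits is used; the abstract specifies neither, and the function name `ensemble_entropy` is hypothetical, not taken from the article.

```python
import math
from collections import Counter


def ensemble_entropy(votes):
    """Shannon entropy (in bits) of the label distribution across ensemble votes.

    Assumption: `votes` holds one hard label per model for a single image.
    The article may use soft outputs or a different log base.
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    n = len(votes)
    entropy = 0.0
    for count in Counter(votes).values():
        p = count / n  # fraction of models voting for this label
        entropy -= p * math.log2(p)
    return entropy


# Illustrative (made-up) vote patterns for a 100-model ensemble.
unanimous = ["left-dominant"] * 100
split = ["left-dominant"] * 50 + ["co-dominant"] * 50
leaning = ["left-dominant"] * 90 + ["co-dominant"] * 10

for name, votes in [("unanimous", unanimous), ("split", split), ("leaning", leaning)]:
    print(f"{name}: {ensemble_entropy(votes):.3f} bits")
```

Under these assumptions, full agreement among the models gives zero entropy (low uncertainty), and an even split between "left-dominant" and "co-dominant" gives the two-label maximum of one bit (high uncertainty).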