Machine learning can aid in prediction of IDH mutation from H&E-stained histology slides in infiltrating gliomas

Bibliographic Details
Published in Scientific Reports Vol. 12; no. 1; p. 22623
Main Authors Liechty, Benjamin; Xu, Zhuoran; Zhang, Zhilu; Slocum, Cheyanne; Bahadir, Cagla D.; Sabuncu, Mert R.; Pisapia, David J.
Format Journal Article
Language English
Published London: Nature Publishing Group UK, 31.12.2022
Nature Publishing Group
Summary: While Machine Learning (ML) models have been increasingly applied to a range of histopathology tasks, there has been little emphasis on characterizing these models and contrasting them with human experts. We present a detailed empirical analysis comparing expert neuropathologists and ML models at predicting IDH mutation status in H&E-stained histology slides of infiltrating gliomas, both independently and synergistically. We find that errors made by neuropathologists and ML models trained using the TCGA dataset are distinct, representing modest agreement between predictions (human-vs.-human κ = 0.656; human-vs.-ML model κ = 0.598). While no ML model surpassed human performance on an independent institutional test dataset (human AUC = 0.901, max ML AUC = 0.881), a hybrid model aggregating human and ML predictions demonstrates predictive performance comparable to the consensus of two expert neuropathologists (hybrid classifier AUC = 0.921 vs. two-neuropathologist consensus AUC = 0.920). We also show that models trained at different levels of magnification exhibit different types of errors, supporting the value of aggregation across spatial scales in the ML approach. Finally, we present a detailed interpretation of our multi-scale ML ensemble model which reveals that predictions are driven by human-identifiable features at the patch-level.
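
Illustrative note: the summary reports agreement as Cohen's κ and performance as AUC, and describes a hybrid that aggregates human and ML predictions. The sketch below shows how those quantities could be computed on toy data. It is not the authors' code: the function names, the toy scores, and the use of a simple average as the aggregation rule are all assumptions made for illustration, since this record does not describe the paper's actual aggregation method.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: fraction of cases where both raters give the same label.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def hybrid(human_scores, model_scores):
    """Hypothetical aggregation: unweighted mean of the two scores per case."""
    return [(h + m) / 2 for h, m in zip(human_scores, model_scores)]


# Toy data (invented): 1 = IDH-mutant, 0 = IDH-wildtype.
y_true = [1, 1, 1, 0, 0, 0]
human = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]   # hypothetical pathologist confidence
model = [0.7, 0.3, 0.8, 0.2, 0.6, 0.1]   # hypothetical ML probability

print("human AUC :", auc(y_true, human))
print("model AUC :", auc(y_true, model))
print("hybrid AUC:", auc(y_true, hybrid(human, model)))

# Agreement between the two raters after thresholding at 0.5.
human_calls = [int(s >= 0.5) for s in human]
model_calls = [int(s >= 0.5) for s in model]
print("kappa     :", cohen_kappa(human_calls, model_calls))
```

In this toy set the human and the model each misrank a different case, so averaging their scores separates the classes better than either alone. That is the general intuition behind combining raters whose errors are distinct, and it says nothing about the magnitudes reported in the paper.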
ISSN: 2045-2322
DOI: 10.1038/s41598-022-26170-6