Robust and Interpretable Convolutional Neural Networks to Detect Glaucoma in Optical Coherence Tomography Images

Bibliographic Details
Published in	IEEE Transactions on Biomedical Engineering, Vol. 68, no. 8, pp. 2456-2466
Main Authors	Thakoor, Kaveri A.; Koorathota, Sharath C.; Hood, Donald C.; Sajda, Paul
Format	Journal Article
Language	English
Published	United States: IEEE, 01.08.2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Artificial intelligence; Artificial neural networks; Computer-aided decision support; Data models; Decision making; Deep learning; Eye; Eye diseases; Eye movements; eye tracking; Feature extraction; Glaucoma; Learning algorithms; Machine learning; medical expert systems; Medical imaging; Neural networks; Ophthalmology; Optical Coherence Tomography; Retina; Robustness; Testing; Tomography; Training; Transfer learning
Summary:	Recent studies suggest that deep learning systems can now achieve performance on par with medical experts in diagnosing disease. A prime example is in the field of ophthalmology, where convolutional neural networks (CNNs) have been used to detect retinal and ocular diseases. However, this type of artificial intelligence (AI) has yet to be adopted clinically due to questions regarding the robustness of the algorithms to datasets collected at new clinical sites and a lack of explainability of AI-based predictions, especially relative to those of human expert counterparts. In this work, we develop CNN architectures that demonstrate robust detection of glaucoma in optical coherence tomography (OCT) images, and we apply testing with concept activation vectors (TCAV) to infer what image concepts CNNs use to generate predictions. Furthermore, we compare TCAV results to eye fixations of clinicians to identify common decision-making features used by both AI and human experts. We find that employing fine-tuned transfer learning and CNN ensemble learning yields end-to-end deep learning models with superior robustness compared to previously reported hybrid deep-learning/machine-learning models. The TCAV/eye-fixation comparison suggests the importance of three OCT report sub-images that are consistent with areas of interest fixated upon by OCT experts to detect glaucoma. The pipeline described here for evaluating CNN robustness and validating interpretable image concepts used by CNNs with eye movements of experts has the potential to help standardize the acceptance of new AI tools for use in the clinic.
ISSN:	0018-9294, 1558-2531
DOI:	10.1109/TBME.2020.3043215
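
Illustrative note: the summary refers to testing with concept activation vectors (TCAV). The sketch below is a minimal, hypothetical illustration of the general idea and is not the authors' implementation. It assumes layer activations are already available as NumPy arrays, uses a difference of class means as a simplified stand-in for the linear classifier normally trained to obtain a concept activation vector, and replaces the CNN with a toy one-layer head (`toy_logit_grad`) so that gradients can be written by hand. All function names, array shapes, and data are assumptions made for illustration.

```python
import numpy as np


def compute_cav(concept_acts, random_acts):
    """Return a unit vector pointing from random toward concept activations.

    Simplification (assumption): a difference of class means is used here
    instead of the normal vector of a trained linear classifier.
    """
    direction = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)


def tcav_score(grads, cav):
    """Fraction of inputs whose class logit increases along the CAV direction."""
    return float(np.mean(grads @ cav > 0))


def toy_logit_grad(acts, w):
    """Gradient of the toy head logit = sum(w * tanh(acts)) with respect to acts."""
    return w * (1.0 - np.tanh(acts) ** 2)


# Synthetic data (hypothetical): the concept shifts activation unit 0 upward.
rng = np.random.default_rng(0)
dim = 8
shift = np.array([2.0] + [0.0] * (dim - 1))
concept_acts = rng.normal(size=(50, dim)) + shift
random_acts = rng.normal(size=(50, dim))

# Toy classification head that reads only unit 0, so the concept matters to it.
w = np.array([1.5] + [0.0] * (dim - 1))
test_acts = rng.normal(size=(100, dim))

cav = compute_cav(concept_acts, random_acts)
grads = toy_logit_grad(test_acts, w)
score = tcav_score(grads, cav)
print(f"TCAV score for the toy concept: {score:.2f}")
```

In this toy setup a score near 1 indicates that moving activations toward the concept raises the class logit for almost every test input, while a score near 0.5 would suggest no consistent sensitivity. In practice, scores are usually compared against those obtained from several random concept sets before drawing conclusions.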