Improving Clinical Predictions with Multi-Modal Pre-training in Retinal Imaging
Published in: 2024 IEEE International Symposium on Biomedical Imaging (ISBI), pp. 1-5
Main Authors:
Format: Conference Proceeding
Language: English
Published: IEEE, 27.05.2024
Summary: Self-supervised learning has emerged as a foundational approach for creating robust and adaptable artificial intelligence (AI) systems within medical imaging. Specifically, contrastive representation learning methods, trained on extensive multi-modal datasets, have showcased remarkable proficiency in generating highly adaptable representations suitable for a multitude of downstream tasks. In the field of ophthalmology, modern retinal imaging devices capture both 2D fundus images and 3D optical coherence tomography (OCT) scans. As a result, large multi-modal imaging datasets are readily available and allow us to explore uni-modal versus multi-modal contrastive pre-training. After pre-training on 153,306 scan pairs, we showcase the transferability and efficacy of these acquired representations via fine-tuning on multiple external datasets, explicitly focusing on several clinically pertinent prediction tasks derived from OCT data. Additionally, we illustrate how multi-modal pre-training enhances the exchange of information between OCT, a richer modality, and the more cost-effective fundus imaging, ultimately amplifying the predictive capacity of fundus-based models.
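The record itself gives no implementation details, but the pre-training the summary describes is commonly realized as CLIP-style contrastive alignment between a 2D fundus encoder and a 3D OCT encoder over paired scans. The PyTorch sketch below is a minimal illustration under that assumption; the toy encoders, the 128-dimensional projection, the temperature initialization, and the input sizes are hypothetical stand-ins, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiModalContrastive(nn.Module):
    """Minimal CLIP-style contrastive model for paired fundus/OCT scans.

    The encoder architectures and projection size are illustrative
    assumptions; the paper's actual choices are not given in this record.
    """

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        # 2D encoder for fundus photographs (3-channel colour images).
        self.fundus_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        # 3D encoder for OCT volumes (1-channel depth stacks).
        self.oct_encoder = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        # Learnable temperature, initialised to log(1/0.07) as in CLIP.
        self.log_temp = nn.Parameter(torch.tensor(2.659))

    def forward(self, fundus, oct_volume):
        # L2-normalised embeddings so dot products are cosine similarities.
        z_f = F.normalize(self.fundus_encoder(fundus), dim=-1)
        z_o = F.normalize(self.oct_encoder(oct_volume), dim=-1)
        return z_f, z_o

def info_nce_loss(z_f, z_o, log_temp):
    """Symmetric InfoNCE: matching fundus/OCT pairs attract, others repel."""
    logits = z_f @ z_o.t() * log_temp.exp()        # (B, B) similarity matrix
    targets = torch.arange(z_f.size(0), device=z_f.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy usage with random tensors standing in for one batch of scan pairs.
model = MultiModalContrastive()
fundus = torch.randn(4, 3, 64, 64)       # batch of fundus images
oct_vol = torch.randn(4, 1, 32, 64, 64)  # batch of OCT volumes
z_f, z_o = model(fundus, oct_vol)
loss = info_nce_loss(z_f, z_o, model.log_temp)
loss.backward()
```

After pre-training, either encoder can be fine-tuned on its own: the summary reports fine-tuning on external OCT-derived clinical prediction tasks, and notes that the cross-modal alignment also improves fundus-only models by transferring information from the richer OCT modality.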
ISSN: 1945-8452
DOI: 10.1109/ISBI56570.2024.10635447