Coherent Interpretation of Entire Visual Field Test Reports Using a Multimodal Large Language Model (ChatGPT)
This study assesses the accuracy and consistency of a commercially available large language model (LLM) in extracting and interpreting sensitivity and reliability data from entire visual field (VF) test reports for the evaluation of glaucomatous defects. Single-page anonymised VF test reports from 6...
Saved in:
Published in | Vision (Basel) Vol. 9; no. 2; p. 33 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published |
Switzerland
MDPI AG
11.04.2025
MDPI |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | This study assesses the accuracy and consistency of a commercially available large language model (LLM) in extracting and interpreting sensitivity and reliability data from entire visual field (VF) test reports for the evaluation of glaucomatous defects. Single-page anonymised VF test reports from 60 eyes of 60 subjects were analysed by an LLM (ChatGPT 4o) across four domains—test reliability, defect type, defect severity and overall diagnosis. The main outcome measures were accuracy of data extraction, interpretation of glaucomatous field defects and diagnostic classification. The LLM displayed 100% accuracy in the extraction of global sensitivity and reliability metrics and in classifying test reliability. It also demonstrated high accuracy (96.7%) in diagnosing whether the VF defect was consistent with a healthy, suspect or glaucomatous eye. The accuracy in correctly defining the type of defect was moderate (73.3%), which only partially improved when provided with a more defined region of interest. The causes of incorrect defect type were mostly attributed to the wrong location, particularly confusing the superior and inferior hemifields. Numerical/text-based data extraction and interpretation was overall notably superior to image-based interpretation of VF defects. This study demonstrates the potential and also limitations of multimodal LLMs in processing multimodal medical investigation data such as VF reports. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
ISSN: | 2411-5150 2411-5150 |
DOI: | 10.3390/vision9020033 |