Predicting Glaucoma Before Onset Using a Large Language Model Chatbot

Bibliographic Details
Published in American journal of ophthalmology Vol. 266; pp. 289-299
Main Authors Huang, Xiaoqin, Raja, Hina, Madadi, Yeganeh, Delsoz, Mohammad, Poursoroush, Asma, Kahook, Malik Y., Yousefi, Siamak
Format Journal Article
Language English
Published United States Elsevier Inc 01.10.2024
Summary: This retrospective case-control study investigated the capability of ChatGPT for forecasting the conversion from ocular hypertension (OHT) to glaucoma based on the Ocular Hypertension Treatment Study (OHTS). A total of 3008 eyes of 1504 subjects from the OHTS were included in the study. We selected demographic, clinical, ocular, optic nerve head, and visual field (VF) parameters 1 year before glaucoma development from the OHTS participants. Subsequently, we developed queries by converting tabular parameters into textual format based on both eyes of all participants. We used the ChatGPT application program interface (API) to automatically perform ChatGPT prompting for all subjects. We then investigated whether ChatGPT can accurately forecast conversion from OHT to glaucoma based on various objective metrics. The main outcome measures were accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and weighted F1 score. ChatGPT4.0 demonstrated an accuracy of 75%, AUC of 0.67, sensitivity of 56%, specificity of 78%, and weighted F1 score of 0.77 in predicting conversion to glaucoma 1 year before onset. ChatGPT3.5 provided an accuracy of 61%, AUC of 0.62, sensitivity of 64%, specificity of 59%, and weighted F1 score of 0.63 in predicting conversion to glaucoma 1 year before onset. The performance of ChatGPT4.0 in forecasting development of glaucoma 1 year before onset was reasonable. The overall performance of ChatGPT4.0 was consistently higher than that of ChatGPT3.5. Large language models (LLMs) hold great promise for augmenting glaucoma research capabilities and enhancing clinical care. Future efforts in creating ophthalmology-specific LLMs that leverage multimodal data in combination with active learning may lead to more useful integration with clinical practice and deserve further investigation.
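The workflow in the summary (tabular parameters converted to text queries, then model answers scored against known outcomes) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the field names, the prompt wording, and the helper names `row_to_prompt` and `binary_metrics` are hypothetical and are not taken from the study. The API call itself is left out, and AUC is omitted because it needs a score per case, not only a yes/no label.

```python
def row_to_prompt(row):
    """Render one participant's tabular record as a plain-text query.

    The wording is a hypothetical template, not the study's prompt.
    """
    facts = "; ".join(f"{name}: {value}" for name, value in row.items())
    return (
        "A patient with ocular hypertension has the following findings. "
        f"{facts}. "
        "Will this patient develop glaucoma within 1 year? Answer yes or no."
    )


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Score binary predictions (1 = converts to glaucoma, 0 = does not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Per-class F1, then averaged with each class weighted by its support.
    f1_pos = _ratio(2 * tp, 2 * tp + fp + fn)
    f1_neg = _ratio(2 * tn, 2 * tn + fp + fn)
    n_pos, n_neg = tp + fn, tn + fp

    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, n_pos),
        "specificity": _ratio(tn, n_neg),
        "weighted_f1": _ratio(n_pos * f1_pos + n_neg * f1_neg, len(pairs)),
    }


if __name__ == "__main__":
    # Hypothetical record; values are made up for demonstration.
    example = {"age (years)": 62, "IOP right eye (mmHg)": 26, "IOP left eye (mmHg)": 25}
    print(row_to_prompt(example))
    print(binary_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

Weighting F1 by class support matters in a cohort like this one, where eyes that convert are the minority: accuracy and weighted F1 can look favorable even when sensitivity is modest.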
ISSN: 0002-9394
1879-1891
DOI: 10.1016/j.ajo.2024.05.022