GPT-4V(ision) Unsuitable for Clinical Care and Education: A Clinician-Evaluated Assessment

Bibliographic Details
Published in	arXiv.org
Main Authors	Senkaiahliyan, Senthujan; Toma, Augustin; Ma, Jun; Chan, An-Wen; Ha, Andrew; An, Kevin R; Suresh, Hrishikesh; Rubin, Barry; Wang, Bo
Format	Paper Journal Article
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 14.11.2023
Subjects	Computed tomography; Computer Science - Computer Vision and Pattern Recognition; Decision making; Education; Large language models; Medical imaging

Summary:	OpenAI's large multimodal model, GPT-4V(ision), was recently developed for general image interpretation. However, less is known about its capabilities with medical image interpretation and diagnosis. Board-certified physicians and senior residents assessed GPT-4V's proficiency across a range of medical conditions using imaging modalities such as CT scans, MRIs, ECGs, and clinical photographs. Although GPT-4V is able to identify and explain medical images, its diagnostic accuracy and clinical decision-making abilities are poor, posing risks to patient safety. Despite the potential that large language models may have in enhancing medical education and delivery, the current limitations of GPT-4V in interpreting medical images reinforce the importance of appropriate caution when using it for clinical decision-making.
ISSN:	2331-8422
DOI:	10.48550/arxiv.2403.12046