The utility of artificial intelligence platforms for patient‐generated questions in Mohs micrographic surgery: a multi‐national, blinded expert panel evaluation

Bibliographic Details
Published in	International Journal of Dermatology Vol. 63; no. 11; pp. 1592-1598
Main Authors	Lauck, Kyle C.; Cho, Seo Won; DaCunha, Matthew; Wuennenberg, John; Aasi, Sumaira; Alam, Murad; Arron, Sarah T.; Bar, Anna; Brodland, David G.; Cerci, Felipe B.; Cohen, Joel L.; Coldiron, Brett; Council, M. Laurin; Harmon, Christopher B.; Hruza, George; Läuchli, Severin; Moody, Brent R.; Wysong, Ashley S.; Zitelli, John A.; Tolkachjov, Stanislav N.
Format	Journal Article
Language	English
Published	England: Blackwell Publishing Ltd, 01.11.2024
Subjects	AI; Artificial intelligence; dermatologic surgery; Dermatology; Education; Evaluation; Information processing; language learning model; Large language models; Micrography; Mohs micrographic surgery; patient education; Patients; Questions; Search engines; Sensory evaluation; skin cancer education; Surgery; Surgical instruments
Summary:	Background: Artificial intelligence (AI) and large language models (LLMs) transform how patients inform themselves. LLMs offer potential as educational tools, but their quality depends upon the information generated. Current literature examining AI as an informational tool in dermatology has been limited in evaluating AI's multifaceted roles and diversity of opinions. Here, we evaluate LLMs as a patient‐educational tool for Mohs micrographic surgery (MMS) in and out of the clinic utilizing an international expert panel. Methods: The most common patient MMS questions were extracted from Google and transposed into two LLMs and Google's search engine. 15 MMS surgeons evaluated the generated responses, examining their appropriateness as a patient‐facing informational platform, sufficiency of response in a clinical environment, and accuracy of content generated. Validated scales were employed to assess the comprehensibility of each response. Results: The majority of reviewers deemed all LLM responses appropriate. 75% of responses were rated as mostly accurate or higher. ChatGPT had the highest mean accuracy. The majority of the panel deemed 33% of responses sufficient for clinical practice. The mean comprehensibility scores for all platforms indicated a required 10th‐grade reading level. Conclusions: LLM‐generated responses were rated as appropriate patient informational sources and mostly accurate in their content. However, these platforms may not provide sufficient information to function in a clinical environment, and complex comprehensibility may represent a barrier to utilization. As the popularity of these platforms increases, it is important for dermatologists to be aware of these limitations.
Bibliography:	Conflict of interest: Dr. Tolkachjov is a speaker/investigator for CASTLE Biosciences and Bioventus/LifeNet. The other authors have no relevant COI to disclose. Funding source: None.
ISSN:	0011-9059 1365-4632
DOI:	10.1111/ijd.17382