Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care

Bibliographic Details
Published in	JMIR medical education Vol. 9; p. e46599
Main Authors	Thirunavukarasu, Arun James; Hassan, Refaat; Mahmood, Shathar; Sanghera, Rohan; Barzangi, Kara; El Mukashfi, Mohanned; Shah, Sachin
Format	Journal Article
Language	English
Published	Canada JMIR Publications 21.04.2023
Subjects	Accuracy; Chatbots; Deep learning; Design; Large language models; Medical research; Multimedia; Observational studies; Original Paper; Primary care; Statistical analysis; United Kingdom--UK; chatbot; decision support techniques; natural language processing; ChatGPT; general practice; AI; family medicine; large language model; artificial intelligence

Abstract	Background: Large language models exhibiting human-level performance in specialized tasks are emerging; examples include Generative Pretrained Transformer 3.5, which underlies the processing of ChatGPT. Rigorous trials are required to understand the capabilities of emerging technology, so that innovation can be directed to benefit patients and practitioners.
Objective: Here, we evaluated the strengths and weaknesses of ChatGPT in primary care using the Membership of the Royal College of General Practitioners Applied Knowledge Test (AKT) as a medium.
Methods: AKT questions were sourced from a web-based question bank and 2 AKT practice papers. In total, 674 unique AKT questions were inputted to ChatGPT, with the model's answers recorded and compared to correct answers provided by the Royal College of General Practitioners. Each question was inputted twice in separate ChatGPT sessions, with answers on repeated trials compared to gauge consistency. Subject difficulty was gauged by referring to examiners' reports from 2018 to 2022. Novel explanations from ChatGPT (defined as information provided that was not inputted within the question or multiple answer choices) were recorded. Performance was analyzed with respect to subject, difficulty, question source, and novel model outputs to explore ChatGPT's strengths and weaknesses.
Results: Average overall performance of ChatGPT was 60.17%, which is below the mean passing mark in the last 2 years (70.42%). Accuracy differed between sources (P=.04 and .06). ChatGPT's performance varied with subject category (P=.02 and .02), but variation did not correlate with difficulty (Spearman ρ=−0.241 and −0.238; P=.19 and .20). The proclivity of ChatGPT to provide novel explanations did not affect accuracy (P>.99 and .23).
Conclusions: Large language models are approaching human expert-level performance, although further development is required to match the performance of qualified primary care physicians in the AKT. Validated high-performance models may serve as assistants or autonomous clinical tools to ameliorate the general practice workforce crisis.
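The scoring described in the abstract (two separate trials per question, accuracy against an answer key, agreement between trials, and comparison with a pass mark) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration only: the function names, the five-question answer key, and both response lists are hypothetical toy values, not the study's data or the authors' code. Only the 70.42% pass mark is taken from the abstract.

```python
from typing import Sequence


def accuracy(responses: Sequence[str], key: Sequence[str]) -> float:
    """Fraction of responses that match the answer key."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(r == k for r, k in zip(responses, key)) / len(key)


def consistency(first: Sequence[str], second: Sequence[str]) -> float:
    """Fraction of questions answered identically in both trials."""
    if len(first) != len(second):
        raise ValueError("both trials must cover the same questions")
    return sum(a == b for a, b in zip(first, second)) / len(first)


# Hypothetical toy data (NOT from the study): 5 questions, options A-E.
answer_key = ["A", "C", "B", "E", "D"]
trial_1 = ["A", "C", "D", "E", "B"]  # model answers, first session
trial_2 = ["A", "B", "B", "E", "B"]  # model answers, second session

# Mean pass mark quoted in the abstract, expressed as a fraction.
PASS_MARK = 0.7042

acc_1 = accuracy(trial_1, answer_key)
acc_2 = accuracy(trial_2, answer_key)
mean_accuracy = (acc_1 + acc_2) / 2
agreement = consistency(trial_1, trial_2)
passes = mean_accuracy >= PASS_MARK

print(f"trial 1 accuracy: {acc_1:.2%}")
print(f"trial 2 accuracy: {acc_2:.2%}")
print(f"mean accuracy:    {mean_accuracy:.2%}")
print(f"consistency:      {agreement:.2%}")
print(f"meets pass mark:  {passes}")
```

Note that agreement between trials is measured independently of correctness: a model can give the same wrong answer twice, so consistency and accuracy need to be reported separately.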
Author	Thirunavukarasu, Arun James (ORCID 0000-0001-8968-4768); Hassan, Refaat (ORCID 0000-0002-3054-1161); Mahmood, Shathar (ORCID 0009-0008-4209-1306); Sanghera, Rohan (ORCID 0000-0001-6370-8426); Barzangi, Kara (ORCID 0009-0009-0327-1221); El Mukashfi, Mohanned (ORCID 0009-0001-8158-0216); Shah, Sachin (ORCID 0009-0008-2470-6143)
AuthorAffiliation	1 University of Cambridge School of Clinical Medicine, Cambridge, United Kingdom; 2 Attenborough Surgery, Bushey Medical Centre, Bushey, United Kingdom
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/37083633 (view this record in MEDLINE/PubMed)
ContentType	Journal Article
Copyright	Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI	10.2196/46599
EISSN	2369-3762
ExternalDocumentID	oai_doaj_org_article_40b47fdcf9f643c88aef93d12bdb0dd2 PMC10163403 37083633 10_2196_46599
Genre	Journal Article
GeographicLocations	United Kingdom--UK
ISSN	2369-3762
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Keywords	chatbot; deep learning; decision support techniques; natural language processing; ChatGPT; general practice; AI; primary care; family medicine; large language model; artificial intelligence
Language	English
License	Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.
OpenAccessLink	https://www.proquest.com/docview/2917890662?pq-origsite=%requestingapplication%
PMID	37083633
PublicationDateYYYYMMDD	2023-04-21
PublicationPlace	Toronto, Canada
PublicationTitle	JMIR medical education
PublicationTitleAlternate	JMIR Med Educ
PublicationYear	2023
Publisher	JMIR Publications
StartPage	e46599
URI	https://www.ncbi.nlm.nih.gov/pubmed/37083633 https://www.proquest.com/docview/2917890662 https://www.proquest.com/docview/2805031598 https://pubmed.ncbi.nlm.nih.gov/PMC10163403 https://doaj.org/article/40b47fdcf9f643c88aef93d12bdb0dd2
Volume	9