Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care

Bibliographic Details
Published in JMIR medical education Vol. 9; p. e46599
Main Authors Thirunavukarasu, Arun James, Hassan, Refaat, Mahmood, Shathar, Sanghera, Rohan, Barzangi, Kara, El Mukashfi, Mohanned, Shah, Sachin
Format Journal Article
Language English
Published Canada JMIR Publications 21.04.2023
Abstract
Background: Large language models exhibiting human-level performance in specialized tasks are emerging; examples include Generative Pretrained Transformer 3.5, which underlies the processing of ChatGPT. Rigorous trials are required to understand the capabilities of emerging technology, so that innovation can be directed to benefit patients and practitioners.
Objective: Here, we evaluated the strengths and weaknesses of ChatGPT in primary care using the Membership of the Royal College of General Practitioners Applied Knowledge Test (AKT) as a medium.
Methods: AKT questions were sourced from a web-based question bank and 2 AKT practice papers. In total, 674 unique AKT questions were inputted to ChatGPT, with the model's answers recorded and compared to correct answers provided by the Royal College of General Practitioners. Each question was inputted twice in separate ChatGPT sessions, with answers on repeated trials compared to gauge consistency. Subject difficulty was gauged by referring to examiners' reports from 2018 to 2022. Novel explanations from ChatGPT, defined as information provided that was not inputted within the question or multiple answer choices, were recorded. Performance was analyzed with respect to subject, difficulty, question source, and novel model outputs to explore ChatGPT's strengths and weaknesses.
Results: Average overall performance of ChatGPT was 60.17%, which is below the mean passing mark in the last 2 years (70.42%). Accuracy differed between sources (P=.04 and .06). ChatGPT's performance varied with subject category (P=.02 and .02), but variation did not correlate with difficulty (Spearman ρ=-0.241 and -0.238; P=.19 and .20). The proclivity of ChatGPT to provide novel explanations did not affect accuracy (P>.99 and .23).
Conclusions: Large language models are approaching human expert-level performance, although further development is required to match the performance of qualified primary care physicians in the AKT. Validated high-performance models may serve as assistants or autonomous clinical tools to ameliorate the general practice workforce crisis.
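The analysis steps named in the abstract (scoring each trial against the answer key, comparing the two trials for consistency, and rank-correlating per-subject accuracy with difficulty) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the five-question answer lists, and the four-subject vectors are hypothetical toy values chosen only to show the calculations, and the abstract does not say which software or tie-handling rule was used. Only the two percentages in the last line come from the abstract.

```python
def accuracy(answers, key):
    """Fraction of questions answered correctly against the answer key."""
    return sum(a == k for a, k in zip(answers, key)) / len(key)


def consistency(trial_1, trial_2):
    """Fraction of questions given the same answer in both trials."""
    return sum(a == b for a, b in zip(trial_1, trial_2)) / len(trial_1)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy data: five questions, two independent sessions.
key = list("ABCDE")
trial_1 = list("ABCDA")
trial_2 = list("ABDDA")

print(accuracy(trial_1, key), accuracy(trial_2, key))  # per-trial accuracy
print(consistency(trial_1, trial_2))                   # agreement between trials

# Hypothetical per-subject accuracy and difficulty scores.
subject_accuracy = [0.55, 0.62, 0.70, 0.48]
subject_difficulty = [3, 1, 2, 4]
print(spearman_rho(subject_accuracy, subject_difficulty))

# Gap between the reported mean pass mark and the reported performance,
# in percentage points (both figures are taken from the abstract).
print(round(70.42 - 60.17, 2))
```

Reporting a pair of values for each statistic, as the abstract does, would correspond to running such calculations once per trial, although that reading is an assumption and is not spelled out in this record.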
Author Sanghera, Rohan
Barzangi, Kara
Thirunavukarasu, Arun James
El Mukashfi, Mohanned
Mahmood, Shathar
Hassan, Refaat
Shah, Sachin
AuthorAffiliation 1 University of Cambridge School of Clinical Medicine Cambridge United Kingdom
2 Attenborough Surgery Bushey Medical Centre Bushey United Kingdom
Author_xml – sequence: 1
  givenname: Arun James
  orcidid: 0000-0001-8968-4768
  surname: Thirunavukarasu
  fullname: Thirunavukarasu, Arun James
– sequence: 2
  givenname: Refaat
  orcidid: 0000-0002-3054-1161
  surname: Hassan
  fullname: Hassan, Refaat
– sequence: 3
  givenname: Shathar
  orcidid: 0009-0008-4209-1306
  surname: Mahmood
  fullname: Mahmood, Shathar
– sequence: 4
  givenname: Rohan
  orcidid: 0000-0001-6370-8426
  surname: Sanghera
  fullname: Sanghera, Rohan
– sequence: 5
  givenname: Kara
  orcidid: 0009-0009-0327-1221
  surname: Barzangi
  fullname: Barzangi, Kara
– sequence: 6
  givenname: Mohanned
  orcidid: 0009-0001-8158-0216
  surname: El Mukashfi
  fullname: El Mukashfi, Mohanned
– sequence: 7
  givenname: Sachin
  orcidid: 0009-0008-2470-6143
  surname: Shah
  fullname: Shah, Sachin
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37083633$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023.
2023. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.2196/46599
EISSN 2369-3762
ExternalDocumentID oai_doaj_org_article_40b47fdcf9f643c88aef93d12bdb0dd2
PMC10163403
37083633
10_2196_46599
Genre Journal Article
GeographicLocations United Kingdom--UK
ISSN 2369-3762
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords chatbot
deep learning
decision support techniques
natural language processing
ChatGPT
general practice
AI
primary care
family medicine
large language model
artificial intelligence
Language English
License Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.
OpenAccessLink https://www.proquest.com/docview/2917890662?pq-origsite=%requestingapplication%
PMID 37083633
PublicationPlace Canada
PublicationPlace_xml – name: Canada
– name: Toronto
– name: Toronto, Canada
PublicationTitle JMIR medical education
PublicationTitleAlternate JMIR Med Educ
PublicationYear 2023
Publisher JMIR Publications
Publisher_xml – name: JMIR Publications
StartPage e46599
SubjectTerms Accuracy
Chatbots
Deep learning
Design
Large language models
Medical research
Multimedia
Observational studies
Original Paper
Primary care
Statistical analysis
URI https://www.ncbi.nlm.nih.gov/pubmed/37083633
https://www.proquest.com/docview/2917890662
https://www.proquest.com/docview/2805031598
https://pubmed.ncbi.nlm.nih.gov/PMC10163403
https://doaj.org/article/40b47fdcf9f643c88aef93d12bdb0dd2
Volume 9