Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care
Published in | JMIR medical education Vol. 9; p. e46599 |
---|---|
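Main Authors | Thirunavukarasu, Arun James; Hassan, Refaat; Mahmood, Shathar; Sanghera, Rohan; Barzangi, Kara; El Mukashfi, Mohanned; Shah, Sachin |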
Format | Journal Article |
Language | English |
Published | Canada: JMIR Publications, 21.04.2023 |
Abstract | Background: Large language models exhibiting human-level performance in specialized tasks are emerging; examples include Generative Pretrained Transformer 3.5, which underlies the processing of ChatGPT. Rigorous trials are required to understand the capabilities of emerging technology, so that innovation can be directed to benefit patients and practitioners.
Objective: Here, we evaluated the strengths and weaknesses of ChatGPT in primary care using the Membership of the Royal College of General Practitioners Applied Knowledge Test (AKT) as a medium.
Methods: AKT questions were sourced from a web-based question bank and 2 AKT practice papers. In total, 674 unique AKT questions were inputted to ChatGPT, with the model's answers recorded and compared to correct answers provided by the Royal College of General Practitioners. Each question was inputted twice in separate ChatGPT sessions, with answers on repeated trials compared to gauge consistency. Subject difficulty was gauged by referring to examiners' reports from 2018 to 2022. Novel explanations from ChatGPT, defined as information provided that was not inputted within the question or multiple answer choices, were recorded. Performance was analyzed with respect to subject, difficulty, question source, and novel model outputs to explore ChatGPT's strengths and weaknesses.
Results: Average overall performance of ChatGPT was 60.17%, which is below the mean passing mark in the last 2 years (70.42%). Accuracy differed between sources (P=.04 and .06). ChatGPT's performance varied with subject category (P=.02 and .02), but variation did not correlate with difficulty (Spearman ρ=-0.241 and -0.238; P=.19 and .20). The proclivity of ChatGPT to provide novel explanations did not affect accuracy (P>.99 and .23).
Conclusions: Large language models are approaching human expert-level performance, although further development is required to match the performance of qualified primary care physicians in the AKT. Validated high-performance models may serve as assistants or autonomous clinical tools to ameliorate the general practice workforce crisis. |
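The scoring described in the abstract (each question answered in two separate sessions, answers compared with an answer key and with each other) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the five-question toy answer key, and the reading of "average overall performance" as the mean of the two runs are not taken from the study. Only the 60.17% and 70.42% figures come from the abstract.

```python
from typing import Sequence


def accuracy(answers: Sequence[str], key: Sequence[str]) -> float:
    """Return the percentage of answers that match the answer key."""
    if not key or len(answers) != len(key):
        raise ValueError("answers and key must be non-empty and equal in length")
    correct = sum(a == k for a, k in zip(answers, key))
    return 100.0 * correct / len(key)


def consistency(run_a: Sequence[str], run_b: Sequence[str]) -> float:
    """Return the percentage of questions answered identically in both runs."""
    if not run_a or len(run_a) != len(run_b):
        raise ValueError("runs must be non-empty and equal in length")
    same = sum(a == b for a, b in zip(run_a, run_b))
    return 100.0 * same / len(run_a)


# Hypothetical toy data: five single-best-answer questions, two sessions.
key = ["A", "C", "B", "D", "E"]
run_1 = ["A", "C", "D", "D", "B"]
run_2 = ["A", "B", "D", "D", "E"]

acc_1 = accuracy(run_1, key)
acc_2 = accuracy(run_2, key)
mean_acc = (acc_1 + acc_2) / 2      # assumed meaning of "average overall performance"
agree = consistency(run_1, run_2)

# Figures reported in the abstract: model average and mean passing mark.
REPORTED_ACCURACY = 60.17
MEAN_PASS_MARK = 70.42
gap = MEAN_PASS_MARK - REPORTED_ACCURACY  # shortfall in percentage points

print(f"run 1: {acc_1:.2f}%, run 2: {acc_2:.2f}%, mean: {mean_acc:.2f}%")
print(f"agreement between runs: {agree:.2f}%")
print(f"reported shortfall against pass mark: {gap:.2f} percentage points")
```

On the reported figures, the shortfall against the mean passing mark is 10.25 percentage points. Note that two runs can have the same accuracy while disagreeing on individual questions, which is why consistency is tracked separately from accuracy.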
---|---|
Author | Sanghera, Rohan Barzangi, Kara Thirunavukarasu, Arun James El Mukashfi, Mohanned Mahmood, Shathar Hassan, Refaat Shah, Sachin |
AuthorAffiliation | 1 University of Cambridge School of Clinical Medicine Cambridge United Kingdom 2 Attenborough Surgery Bushey Medical Centre Bushey United Kingdom |
Author_xml | – sequence: 1 givenname: Arun James orcidid: 0000-0001-8968-4768 surname: Thirunavukarasu fullname: Thirunavukarasu, Arun James – sequence: 2 givenname: Refaat orcidid: 0000-0002-3054-1161 surname: Hassan fullname: Hassan, Refaat – sequence: 3 givenname: Shathar orcidid: 0009-0008-4209-1306 surname: Mahmood fullname: Mahmood, Shathar – sequence: 4 givenname: Rohan orcidid: 0000-0001-6370-8426 surname: Sanghera fullname: Sanghera, Rohan – sequence: 5 givenname: Kara orcidid: 0009-0009-0327-1221 surname: Barzangi fullname: Barzangi, Kara – sequence: 6 givenname: Mohanned orcidid: 0009-0001-8158-0216 surname: El Mukashfi fullname: El Mukashfi, Mohanned – sequence: 7 givenname: Sachin orcidid: 0009-0008-2470-6143 surname: Shah fullname: Shah, Sachin |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37083633$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.2196/46599 |
EISSN | 2369-3762 |
ExternalDocumentID | oai_doaj_org_article_40b47fdcf9f643c88aef93d12bdb0dd2 PMC10163403 37083633 10_2196_46599 |
Genre | Journal Article |
GeographicLocations | United Kingdom--UK |
ISSN | 2369-3762 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | chatbot; deep learning; decision support techniques; natural language processing; ChatGPT; general practice; AI; primary care; family medicine; large language model; artificial intelligence |
Language | English |
License | Arun James Thirunavukarasu, Refaat Hassan, Shathar Mahmood, Rohan Sanghera, Kara Barzangi, Mohanned El Mukashfi, Sachin Shah. Originally published in JMIR Medical Education (https://mededu.jmir.org), 21.04.2023. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included. |
ORCID | 0000-0001-8968-4768 0000-0002-3054-1161 0000-0001-6370-8426 0009-0008-4209-1306 0009-0009-0327-1221 0009-0008-2470-6143 0009-0001-8158-0216 |
OpenAccessLink | https://www.proquest.com/docview/2917890662?pq-origsite=%requestingapplication% |
PMID | 37083633 |
PublicationDate | 2023-04-21 |
PublicationPlace | Canada |
PublicationPlace_xml | – name: Canada – name: Toronto – name: Toronto, Canada |
PublicationTitle | JMIR medical education |
PublicationTitleAlternate | JMIR Med Educ |
PublicationYear | 2023 |
Publisher | JMIR Publications |
StartPage | e46599 |
SubjectTerms | Accuracy; Chatbots; Deep learning; Design; Large language models; Medical research; Multimedia; Observational studies; Original Paper; Primary care; Statistical analysis |
Title | Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37083633 https://www.proquest.com/docview/2917890662 https://www.proquest.com/docview/2805031598 https://pubmed.ncbi.nlm.nih.gov/PMC10163403 https://doaj.org/article/40b47fdcf9f643c88aef93d12bdb0dd2 |
Volume | 9 |