Large Language Models in Biochemistry Education: Comparative Evaluation of Performance

Bibliographic Details
Published in: JMIR Medical Education (JMIR Med Educ), Vol. 11, p. e67244
Main Authors: Bolgova, Olena; Shypilova, Inna; Mavrych, Volodymyr
Format: Journal Article
Language: English
Published: Toronto, Canada: JMIR Publications, 10 April 2025
ISSN (electronic): 2369-3762
DOI: 10.2196/67244

Abstract

Background: Recent advancements in artificial intelligence (AI), particularly in large language models (LLMs), have started a new era of innovation across various fields, with medicine at the forefront of this technological revolution. Many studies have indicated that, at the current level of development, LLMs can pass different board exams. However, their ability to answer specific subject-related questions requires validation.

Objective: The objective of this study was to conduct a comprehensive analysis comparing the performance of advanced LLM chatbots, namely Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google), and Copilot (Microsoft), against the academic results of medical students in the medical biochemistry course.

Methods: We used 200 USMLE (United States Medical Licensing Examination)-style multiple-choice questions (MCQs) selected from the course exam database. They encompassed various complexity levels and were distributed across 23 distinct topics. Questions with tables and images were not included in the study. The results of 5 successive attempts by Claude 3.5 Sonnet, GPT-4-1106, Gemini 1.5 Flash, and Copilot to answer this question set were evaluated based on accuracy in August 2024. Statistica 13.5.0.17 (TIBCO Software Inc) was used to compute basic statistics. Given the binary nature of the data, the chi-square test was used to compare results among the different chatbots, with a statistical significance level of P<.05.

Results: On average, the selected chatbots correctly answered 81.1% (SD 12.8%) of the questions, surpassing the students' performance by 8.3% (P=.02). Claude showed the best performance on the biochemistry MCQs, correctly answering 92.5% (185/200) of questions, followed by GPT-4 (170/200, 85%), Gemini (157/200, 78.5%), and Copilot (128/200, 64%). The chatbots achieved their best results in the following 4 topics: eicosanoids (mean 100%, SD 0%), bioenergetics and electron transport chain (mean 96.4%, SD 7.2%), hexose monophosphate pathway (mean 91.7%, SD 16.7%), and ketone bodies (mean 93.8%, SD 12.5%). The Pearson chi-square test indicated a statistically significant association between the answers of all 4 chatbots (P<.001 to P<.04).

Conclusions: Our study suggests that different AI models may have unique strengths in specific medical fields, which could be leveraged for targeted support in biochemistry courses. This performance highlights the potential of AI in medical education and assessment.
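
As an illustration of the statistical comparison described in the Methods, the short sketch below applies a Pearson chi-square test to the per-model correct/incorrect counts reported in the Results. It is a minimal sketch only, not the authors' analysis pipeline: the study used Statistica, whereas SciPy is assumed here, and the contingency-table layout is an assumption for illustration.

```python
# Minimal sketch (not the authors' Statistica workflow): Pearson chi-square
# comparison of the four chatbots' accuracy on the 200 biochemistry MCQs,
# using the correct/incorrect counts reported in the Results.
from scipy.stats import chi2_contingency

# Rows: Claude 3.5 Sonnet, GPT-4, Gemini 1.5 Flash, Copilot
# Columns: questions answered correctly / incorrectly (out of 200)
counts = [
    [185, 15],   # Claude: 92.5%
    [170, 30],   # GPT-4: 85%
    [157, 43],   # Gemini: 78.5%
    [128, 72],   # Copilot: 64%
]

chi2, p, dof, _expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, P = {p:.3g}")
# At the study's significance level (P < .05), a small P value indicates that
# accuracy differs significantly across the four models.
```
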
Authors and ORCID iDs
– Bolgova, Olena (ORCID: 0009-0002-9496-9754)
– Shypilova, Inna (ORCID: 0009-0000-0707-6997)
– Mavrych, Volodymyr (ORCID: 0009-0009-1159-4573)
Copyright: © Olena Bolgova, Inna Shypilova, Volodymyr Mavrych, 2025. Originally published in JMIR Medical Education (https://mededu.jmir.org).
PMID: 40209205
PMCID: PMC12005600
Genre: Journal Article; Comparative Study
Geographic Location: United States
Open Access: Yes
Peer Reviewed: Yes
Keywords: artificial intelligence; AI; machine learning; ML; natural language processing; NLP; large language model; LLM; ChatGPT; GPT-4; Claude; Gemini; Copilot; medical education; medical students; medical course; biochemistry; bioenergetics; questionnaire; comprehensive analysis
License: This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.
Notes: None declared.
Open Access Link: https://doaj.org/article/ee3d6e43f2ef41aab5b21a8e28edd6e4
Subject Terms: Artificial Intelligence; Biochemistry - education; Chatbots and Conversational Agents; e-Learning and Digital Medical Education; Educational Measurement - methods; Humans; Large Language Models; Machine Learning; New Methods and Approaches in Medical Education; New Resources for Medical Education; Original Paper; Students, Medical - statistics & numerical data; Surveys and Questionnaires; Testing and Assessment in Medical Education; Theme Issue: ChatGPT and Generative Language Models in Medical Education; United States
URIs:
– https://www.ncbi.nlm.nih.gov/pubmed/40209205
– https://www.proquest.com/docview/3188815458
– https://pubmed.ncbi.nlm.nih.gov/PMC12005600
– https://doaj.org/article/ee3d6e43f2ef41aab5b21a8e28edd6e4