A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists
Published in | Nature chemistry Vol. 17; no. 7; pp. 1027 - 1034 |
---|---|
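Main Authors | Mirza, Adrian; Alampara, Nawaf; Kunchapu, Sreekanth; Ríos-García, Martiño; Emoekabu, Benedict; Krishnan, Aswanth; Gupta, Tanya; Schilling-Wilhelmi, Mara; Okereke, Macjonathan; Aneesh, Anagha; Asgari, Mehrdad; Eberhardt, Juliane; Elahi, Amir Mohammad; Elbeheiry, Hani M.; Gil, María Victoria; Glaubitz, Christina; Greiner, Maximilian; Holick, Caroline T.; Hoffmann, Tim; Ibrahim, Abdelrahman; Klepsch, Lea C.; Köster, Yannik; Kreth, Fabian Alexander; Meyer, Jakob; Miret, Santiago; Peschel, Jan Matthias; Ringleb, Michael; Roesner, Nicole C.; Schreiber, Johanna; Schubert, Ulrich S.; Stafast, Leanne M.; Wonanke, A. D. Dinga; Pieler, Michael; Schwaller, Philippe; Jablonka, Kevin Maik |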
Format | Journal Article |
Language | English |
Published | London; Nature Publishing Group UK; 01.07.2025; Nature Publishing Group |
Abstract | Large language models (LLMs) have gained widespread interest owing to their ability to process human language and perform tasks on which they have not been explicitly trained. However, we possess only a limited systematic understanding of the chemical capabilities of LLMs, which would be required to improve models and mitigate potential harm. Here we introduce ChemBench, an automated framework for evaluating the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of chemists. We curated more than 2,700 question–answer pairs, evaluated leading open- and closed-source LLMs and found that the best models, on average, outperformed the best human chemists in our study. However, the models struggle with some basic tasks and provide overconfident predictions. These findings reveal LLMs’ impressive chemical capabilities while emphasizing the need for further research to improve their safety and usefulness. They also suggest adapting chemistry education and show the value of benchmarking frameworks for evaluating LLMs in specific domains.
Large language models are increasingly used for diverse tasks, yet we have limited insight into their understanding of chemistry. Now ChemBench—a benchmarking framework containing more than 2,700 question–answer pairs—has been developed to assess their chemical knowledge and reasoning, revealing that the best models surpass human chemists on average but struggle with some basic tasks. |
---|---|
Author | Gil, María Victoria; Ringleb, Michael; Schubert, Ulrich S.; Asgari, Mehrdad; Jablonka, Kevin Maik; Eberhardt, Juliane; Mirza, Adrian; Pieler, Michael; Emoekabu, Benedict; Schilling-Wilhelmi, Mara; Schreiber, Johanna; Glaubitz, Christina; Roesner, Nicole C.; Stafast, Leanne M.; Krishnan, Aswanth; Kunchapu, Sreekanth; Alampara, Nawaf; Gupta, Tanya; Ríos-García, Martiño; Okereke, Macjonathan; Greiner, Maximilian; Kreth, Fabian Alexander; Holick, Caroline T.; Elahi, Amir Mohammad; Meyer, Jakob; Miret, Santiago; Elbeheiry, Hani M.; Aneesh, Anagha; Klepsch, Lea C.; Peschel, Jan Matthias; Wonanke, A. D. Dinga; Ibrahim, Abdelrahman; Schwaller, Philippe; Hoffmann, Tim; Köster, Yannik |
Author_xml | – sequence: 1 givenname: Adrian orcidid: 0000-0003-4033-4235 surname: Mirza fullname: Mirza, Adrian organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Helmholtz Institute for Polymers in Energy Applications Jena (HIPOLE Jena)
– sequence: 2 givenname: Nawaf orcidid: 0009-0001-7697-7315 surname: Alampara fullname: Alampara, Nawaf organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 3 givenname: Sreekanth orcidid: 0009-0003-5752-0154 surname: Kunchapu fullname: Kunchapu, Sreekanth organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 4 givenname: Martiño orcidid: 0000-0003-1507-4048 surname: Ríos-García fullname: Ríos-García, Martiño organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Institute of Carbon Science and Technology, CSIC
– sequence: 5 givenname: Benedict orcidid: 0009-0001-1860-8132 surname: Emoekabu fullname: Emoekabu, Benedict organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 6 givenname: Aswanth orcidid: 0009-0008-2703-5613 surname: Krishnan fullname: Krishnan, Aswanth organization: QpiVolta Technologies Pvt Ltd
– sequence: 7 givenname: Tanya orcidid: 0009-0001-9523-3290 surname: Gupta fullname: Gupta, Tanya organization: Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne, National Centre of Competence in Research Catalysis, École Polytechnique Fédérale de Lausanne
– sequence: 8 givenname: Mara orcidid: 0009-0007-4392-5918 surname: Schilling-Wilhelmi fullname: Schilling-Wilhelmi, Mara organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 9 givenname: Macjonathan orcidid: 0009-0007-1013-0502 surname: Okereke fullname: Okereke, Macjonathan organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 10 givenname: Anagha orcidid: 0009-0001-0275-2586 surname: Aneesh fullname: Aneesh, Anagha organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 11 givenname: Mehrdad orcidid: 0000-0002-5427-1610 surname: Asgari fullname: Asgari, Mehrdad organization: Department of Chemical Engineering and Biotechnology, University of Cambridge
– sequence: 12 givenname: Juliane orcidid: 0009-0000-3991-0704 surname: Eberhardt fullname: Eberhardt, Juliane organization: Macromolecular Chemistry, University of Bayreuth
– sequence: 13 givenname: Amir Mohammad orcidid: 0009-0001-5907-101X surname: Elahi fullname: Elahi, Amir Mohammad organization: Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne
– sequence: 14 givenname: Hani M. orcidid: 0000-0002-5205-2852 surname: Elbeheiry fullname: Elbeheiry, Hani M. organization: Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena
– sequence: 15 givenname: María Victoria orcidid: 0000-0002-2258-3011 surname: Gil fullname: Gil, María Victoria organization: Institute of Carbon Science and Technology, CSIC
– sequence: 16 givenname: Christina surname: Glaubitz fullname: Glaubitz, Christina
– sequence: 17 givenname: Maximilian surname: Greiner fullname: Greiner, Maximilian organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 18 givenname: Caroline T. orcidid: 0009-0000-1724-2725 surname: Holick fullname: Holick, Caroline T. organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 19 givenname: Tim orcidid: 0009-0004-0230-6115 surname: Hoffmann fullname: Hoffmann, Tim organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 20 givenname: Abdelrahman orcidid: 0009-0003-1460-4710 surname: Ibrahim fullname: Ibrahim, Abdelrahman organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 21 givenname: Lea C. orcidid: 0009-0009-3849-1670 surname: Klepsch fullname: Klepsch, Lea C. organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 22 givenname: Yannik orcidid: 0000-0002-9125-3067 surname: Köster fullname: Köster, Yannik organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 23 givenname: Fabian Alexander orcidid: 0000-0002-5968-8706 surname: Kreth fullname: Kreth, Fabian Alexander organization: Institute for Technical Chemistry and Environmental Chemistry, Friedrich Schiller University Jena, Center for Energy and Environmental Chemistry Jena, Friedrich Schiller University Jena
– sequence: 24 givenname: Jakob surname: Meyer fullname: Meyer, Jakob organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 25 givenname: Santiago orcidid: 0000-0002-5121-3853 surname: Miret fullname: Miret, Santiago organization: Intel Labs
– sequence: 26 givenname: Jan Matthias orcidid: 0009-0002-4787-2757 surname: Peschel fullname: Peschel, Jan Matthias organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena
– sequence: 27 givenname: Michael orcidid: 0000-0002-7320-8529 surname: Ringleb fullname: Ringleb, Michael organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 28 givenname: Nicole C. orcidid: 0000-0002-5133-775X surname: Roesner fullname: Roesner, Nicole C. organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 29 givenname: Johanna orcidid: 0009-0000-0991-8967 surname: Schreiber fullname: Schreiber, Johanna organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 30 givenname: Ulrich S. orcidid: 0000-0003-4978-4670 surname: Schubert fullname: Schubert, Ulrich S. organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Helmholtz Institute for Polymers in Energy Applications Jena (HIPOLE Jena), Jena Center for Soft Matter, Friedrich Schiller University Jena, Center for Energy and Environmental Chemistry Jena, Friedrich Schiller University Jena
– sequence: 31 givenname: Leanne M. orcidid: 0009-0008-5604-261X surname: Stafast fullname: Stafast, Leanne M. organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Jena Center for Soft Matter, Friedrich Schiller University Jena
– sequence: 32 givenname: A. D. Dinga orcidid: 0000-0002-9066-2715 surname: Wonanke fullname: Wonanke, A. D. Dinga organization: Theoretical Chemistry, Technische Universität Dresden
– sequence: 33 givenname: Michael orcidid: 0000-0001-9186-7045 surname: Pieler fullname: Pieler, Michael organization: OpenBioML.org, Stability.AI
– sequence: 34 givenname: Philippe orcidid: 0000-0003-3046-6576 surname: Schwaller fullname: Schwaller, Philippe organization: Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne, National Centre of Competence in Research Catalysis, École Polytechnique Fédérale de Lausanne
– sequence: 35 givenname: Kevin Maik orcidid: 0000-0003-4894-4660 surname: Jablonka fullname: Jablonka, Kevin Maik email: mail@kjablonka.com organization: Laboratory of Organic and Macromolecular Chemistry, Friedrich Schiller University Jena, Helmholtz Institute for Polymers in Energy Applications Jena (HIPOLE Jena), Jena Center for Soft Matter, Friedrich Schiller University Jena, Center for Energy and Environmental Chemistry Jena, Friedrich Schiller University Jena |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40394186$$D View this record in MEDLINE/PubMed |
Cites_doi | 10.1039/D3DD00239J 10.48550/arXiv.2304.10510 10.18653/v1/2023.emnlp-main.468 10.48550/arXiv.2402.01439 10.26434/chemrxiv-2023-05v1b-v2 10.1038/s41524-020-00406-3 10.1038/s41586-023-06792-0 10.48550/arXiv.2311.15936 10.1371/journal.pdig.0000198 10.48550/arXiv.2312.07559 10.1093/bioinformatics/btae104 10.48550/arXiv.2303.12712 10.1039/D3DD00113J 10.1073/pnas.2322420121 10.1016/j.matt.2024.10.015 10.48550/arXiv.2108.07258 10.48550/arXiv.2303.08774 10.48550/arXiv.2403.05075 10.48550/arXiv.2304.05341 10.18653/v1/2023.acl-long.753 10.1038/s41467-024-45563-x 10.48550/arXiv.2409.13740 10.1038/s42256-024-00832-8 10.48550/arXiv.2402.05200 10.48550/arXiv.2305.05708 10.48550/arXiv.2209.07858 10.48550/arXiv.2406.17295 10.18653/v1/2023.acl-long.201 10.1039/D4CS00913D 10.48550/arXiv.2310.18233 10.1038/s42256-023-00740-3 10.48550/arXiv.2305.18365 10.1039/D3SC04610A 10.1039/D3DD00188A 10.1017/pan.2023.2 10.48550/arXiv.2311.07361 10.1038/s42256-022-00511-6 10.48550/arXiv.2211.09085 10.48550/arXiv.2402.06852 10.1038/s42256-023-00788-1 10.1038/s42256-022-00465-9 10.5281/zenodo.14010212 10.1039/C7SC02664A 10.48550/arXiv.2209.01712 10.1145/3442188.3445922 10.1016/j.ymeth.2024.01.004 10.1038/s41467-023-42242-1 10.48550/arXiv.2307.03718 10.48550/arXiv.2310.14029 10.48550/arXiv.2409.13989 10.48550/arXiv.2205.00445 10.48550/arXiv.2411.16736 10.1038/s41570-023-00502-0 10.18653/v1/2023.findings-emnlp.380 |
ContentType | Journal Article |
Copyright | The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
Copyright_xml | – notice: The Author(s) 2025 – notice: 2025. The Author(s). – notice: The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. – notice: The Author(s) 2025 2025 |
DOI | 10.1038/s41557-025-01815-x |
DatabaseName | Springer Nature OA Free Journals; CrossRef; MEDLINE; MEDLINE (Ovid); PubMed; Chemoreception Abstracts; Technology Research Database; Engineering Research Database; ProQuest Health & Medical Complete (Alumni); Biotechnology and BioEngineering Abstracts; MEDLINE - Academic; PubMed Central (Full Participant titles) |
DatabaseTitle | CrossRef; MEDLINE; Medline Complete; MEDLINE with Full Text; PubMed; MEDLINE (Ovid); ProQuest Health & Medical Complete (Alumni); Chemoreception Abstracts; Engineering Research Database; Technology Research Database; Biotechnology and BioEngineering Abstracts; MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic; MEDLINE; CrossRef; ProQuest Health & Medical Complete (Alumni) |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Chemistry |
EISSN | 1755-4349 |
EndPage | 1034 |
ExternalDocumentID | PMC12226332 40394186 10_1038_s41557_025_01815_x |
Genre | Journal Article |
GeographicLocations | United States--US |
GeographicLocations_xml | – name: United States--US |
GrantInformation_xml | – fundername: Ministry of Economy and Competitiveness | Agencia Estatal de Investigación (Spanish Agencia Estatal de Investigación) grantid: CNS2022-135474 funderid: https://doi.org/10.13039/501100011033
– fundername: Helmholtz Association funderid: https://doi.org/10.13039/501100009318
– fundername: Fulbright Association funderid: https://doi.org/10.13039/501100010629
– fundername: Deutsche Forschungsgemeinschaft (German Research Foundation) grantid: 497115849 funderid: https://doi.org/10.13039/501100001659
– fundername: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (Swiss National Science Foundation) grantid: 225147 funderid: https://doi.org/10.13039/501100001711
– fundername: EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020) grantid: 101106377 funderid: https://doi.org/10.13039/501100010661
– fundername: Carl-Zeiss-Stiftung (Carl Zeiss Foundation) funderid: https://doi.org/10.13039/501100007569 |
ID | FETCH-LOGICAL-c428t-eb02b774f1f6a24817dcffc6c9a12fd749ece651b75b9920730e83f2a89e65e43 |
IEDL.DBID | C6C |
ISSN | 1755-4330 1755-4349 |
IngestDate | Thu Aug 21 18:22:33 EDT 2025 Wed Jul 02 03:00:55 EDT 2025 Sat Aug 23 12:47:25 EDT 2025 Mon Jul 07 01:54:12 EDT 2025 Thu Jul 10 08:04:52 EDT 2025 Fri Jul 04 01:21:34 EDT 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 7 |
Language | English |
License | 2025. The Author(s). Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c428t-eb02b774f1f6a24817dcffc6c9a12fd749ece651b75b9920730e83f2a89e65e43 |
ORCID | 0009-0007-4392-5918 0009-0009-3849-1670 0009-0000-1724-2725 0000-0003-4978-4670 0000-0003-3046-6576 0000-0002-7320-8529 0009-0008-5604-261X 0000-0002-5205-2852 0000-0002-5427-1610 0009-0008-2703-5613 0000-0002-2258-3011 0009-0003-1460-4710 0009-0000-3991-0704 0000-0002-5968-8706 0009-0007-1013-0502 0009-0000-0991-8967 0000-0003-4894-4660 0000-0003-4033-4235 0000-0001-9186-7045 0000-0002-9125-3067 0009-0003-5752-0154 0000-0002-5133-775X 0009-0001-5907-101X 0009-0001-9523-3290 0009-0002-4787-2757 0009-0001-0275-2586 0000-0002-9066-2715 0009-0001-7697-7315 0000-0002-5121-3853 0009-0004-0230-6115 0000-0003-1507-4048 0009-0001-1860-8132 |
OpenAccessLink | https://www.nature.com/articles/s41557-025-01815-x |
PMID | 40394186 |
PQID | 3226850375 |
PQPubID | 536302 |
PageCount | 8 |
ParticipantIDs | pubmedcentral_primary_oai_pubmedcentral_nih_gov_12226332 proquest_miscellaneous_3206236455 proquest_journals_3226850375 pubmed_primary_40394186 crossref_primary_10_1038_s41557_025_01815_x springer_journals_10_1038_s41557_025_01815_x |
PublicationCentury | 2000 |
PublicationDate | 2025-07-01 |
PublicationDateYYYYMMDD | 2025-07-01 |
PublicationDate_xml | – month: 07 year: 2025 text: 2025-07-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | London |
PublicationPlace_xml | – name: London – name: England |
PublicationTitle | Nature chemistry |
PublicationTitleAbbrev | Nat. Chem |
PublicationTitleAlternate | Nat Chem |
PublicationYear | 2025 |
Publisher | Nature Publishing Group UK; Nature Publishing Group |
Publisher_xml | – name: Nature Publishing Group UK – name: Nature Publishing Group |
SourceID | pubmedcentral; proquest; pubmed; crossref; springer |
SourceType | Open Access Repository; Aggregation Database; Index Database; Publisher |
StartPage | 1027 |
SubjectTerms | 639/638/630; 639/638/899; Analytical Chemistry; Benchmarks; Biochemistry; Chemical reactions; Chemistry; Chemistry - education; Chemistry and Materials Science; Chemistry/Food Science; Chemists; Design; Humans; Inorganic Chemistry; Knowledge; Language; Large Language Models; Organic Chemistry; Performance evaluation; Physical Chemistry; Reasoning; Toxicity |
Title | A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists |
URI | https://link.springer.com/article/10.1038/s41557-025-01815-x https://www.ncbi.nlm.nih.gov/pubmed/40394186 https://www.proquest.com/docview/3226850375 https://www.proquest.com/docview/3206236455 https://pubmed.ncbi.nlm.nih.gov/PMC12226332 |
Volume | 17 |
linkProvider | Springer Nature |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+framework+for+evaluating+the+chemical+knowledge+and+reasoning+abilities+of+large+language+models+against+the+expertise+of+chemists&rft.jtitle=Nature+chemistry&rft.au=Mirza%2C+Adrian&rft.au=Alampara%2C+Nawaf&rft.au=Kunchapu%2C+Sreekanth&rft.au=R%C3%ADos-Garc%C3%ADa%2C+Marti%C3%B1o&rft.date=2025-07-01&rft.eissn=1755-4349&rft.volume=17&rft.issue=7&rft.spage=1027&rft_id=info:doi/10.1038%2Fs41557-025-01815-x&rft_id=info%3Apmid%2F40394186&rft.externalDocID=40394186 |