Large language model enhanced corpus of CO2 reduction electrocatalysts and synthesis procedures
CO2 electroreduction has garnered significant attention from both the academic and industrial communities. Extracting crucial information related to catalysts from domain literature can help scientists find new and effective electrocatalysts. Herein, we used various advanced machine learning, natur...
Published in | Scientific Data Vol. 11; no. 1; article 347 (12 pp.) |
---|---|
Main Authors | Chen, Xueqing; Gao, Yang; Wang, Ludi; Cui, Wenjuan; Huang, Jiamin; Du, Yi; Wang, Bin |
Format | Journal Article |
Language | English |
Published | London: Nature Publishing Group UK, 06.04.2024; Nature Publishing Group; Nature Portfolio |
Subjects | |
Abstract | CO2 electroreduction has garnered significant attention from both the academic and industrial communities. Extracting crucial information related to catalysts from domain literature can help scientists find new and effective electrocatalysts. Herein, we used various advanced machine learning, natural language processing techniques and large language models (LLMs) approaches to extract relevant information about the CO2 electrocatalytic reduction process from scientific literature. By applying the extraction pipeline, we present an open-source corpus for electrocatalytic CO2 reduction. The database contains two types of corpus: (1) the benchmark corpus, which is a collection of 6,985 records extracted from 1,081 publications by catalysis postgraduates; and (2) the extended corpus, which consists of content extracted from 5,941 documents using traditional NLP techniques and LLMs techniques. The Extended Corpus I and II contain 77,016 and 30,283 records, respectively. Furthermore, several domain literature fine-tuned LLMs were developed. Overall, this work will contribute to the exploration of new and effective electrocatalysts by leveraging information from domain literature using cutting-edge computer techniques. |
---|---|
ArticleNumber | 347 |
Author | Chen, Xueqing; Huang, Jiamin; Wang, Ludi; Gao, Yang; Du, Yi; Cui, Wenjuan; Wang, Bin |
Author_xml | – sequence: 1 givenname: Xueqing orcidid: 0009-0008-8926-9626 surname: Chen fullname: Chen, Xueqing organization: Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, University of Chinese Academy of Sciences – sequence: 2 givenname: Yang orcidid: 0000-0002-3451-1904 surname: Gao fullname: Gao, Yang organization: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology (NCNST) – sequence: 3 givenname: Ludi orcidid: 0000-0002-9346-6250 surname: Wang fullname: Wang, Ludi organization: Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences – sequence: 4 givenname: Wenjuan orcidid: 0000-0002-1858-8194 surname: Cui fullname: Cui, Wenjuan organization: Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences – sequence: 5 givenname: Jiamin surname: Huang fullname: Huang, Jiamin organization: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology (NCNST) – sequence: 6 givenname: Yi orcidid: 0000-0003-3121-8937 surname: Du fullname: Du, Yi email: duyi@cnic.cn organization: Laboratory of Big Data Knowledge, Computer Network Information Center, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Hangzhou Institute for Advanced Study, UCAS – sequence: 7 givenname: Bin orcidid: 0000-0001-9576-2646 surname: Wang fullname: Wang, Bin email: wangb@nanoctr.cn organization: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology (NCNST) |
Cites_doi | 10.1038/s41467-020-17266-6 10.1038/s41586-020-2242-8 10.1038/s41597-019-0224-1 10.1038/s41597-020-00602-2 10.1093/bioinformatics/btp535 10.1038/s41597-023-02089-z 10.1039/C3CS60323G 10.1023/A:1010933404324 10.1002/adma.201802066 10.1021/acscatal.3c00759 10.1021/acs.chemmater.1c02961 10.1038/s41586-018-0337-2 10.57760/sciencedb.13290 10.57760/sciencedb.13292 10.1038/s41597-022-01321-6 10.1162/neco.1997.9.8.1735 10.1038/s41578-022-00466-5 10.57760/sciencedb.13293 10.1038/s41524-019-0204-1 10.1021/acs.chemmater.0c02553 10.1021/acs.jcim.6b00207 10.1038/s41560-019-0450-y 10.1021/acs.jcim.0c00199 10.1186/s13054-023-04393-x 10.18653/v1/P16-1101 10.18653/v1/N16-1030 10.1007/11875741_11 10.18653/v1/2023.ijcnlp-main.45 10.1038/s41467-024-45914-8 10.1021/jacs.3c05819 10.18653/v1/D19-1371 10.18653/v1/P16-2067 10.7759/cureus.35179 10.3115/1220575.1220634 10.18653/v1/D15-1162 |
ContentType | Journal Article |
Copyright | The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1038/s41597-024-03180-9 |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Sciences (General) |
EISSN | 2052-4463 |
EndPage | 12 |
ExternalDocumentID | oai_doaj_org_article_9956e1ef7b8840b9a895372b2cbe8d8f PMC10998834 10_1038_s41597_024_03180_9 |
GrantInformation_xml | – fundername: National Natural Science Foundation of China (National Science Foundation of China) grantid: T2322027 funderid: 501100001809 – fundername: the National Key Research and Development Plan of China under Grant No. 2022YFF0711900 – fundername: the National Key Research and Development Plan of China under Grant No. 2021YFA1202802 the CAS Pioneer Hundred Talents Program – fundername: the National Key Research and Development Plan of China under Grant No.2022YFF0712200 Information Science Database in National Basic Science Data Center under Grant No.NBSDC-DB-25 – fundername: Youth Innovation Promotion Association of the Chinese Academy of Sciences (Youth Innovation Promotion Association CAS) funderid: 501100004739 – fundername: the Young Elite Scientists Sponsorship Program by Beijing Association for Science and Technology (BYESS2023410) |
ID | FETCH-LOGICAL-c469t-572a6057471df7409913de4e5525441d230870e0320f7e6db231311bb1465b803 |
IEDL.DBID | C6C |
ISSN | 2052-4463 |
IngestDate | Wed Aug 27 01:25:06 EDT 2025 Thu Aug 21 18:34:42 EDT 2025 Fri Jul 11 02:45:30 EDT 2025 Wed Aug 13 09:51:15 EDT 2025 Tue Jul 01 00:39:02 EDT 2025 Fri Feb 21 02:39:06 EST 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
License | Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c469t-572a6057471df7409913de4e5525441d230870e0320f7e6db231311bb1465b803 |
ORCID | 0009-0008-8926-9626 0000-0002-1858-8194 0000-0002-9346-6250 0000-0001-9576-2646 0000-0003-3121-8937 0000-0002-3451-1904 |
OpenAccessLink | https://www.nature.com/articles/s41597-024-03180-9 |
PQID | 3033931671 |
PQPubID | 2041912 |
PageCount | 12 |
ParticipantIDs | doaj_primary_oai_doaj_org_article_9956e1ef7b8840b9a895372b2cbe8d8f pubmedcentral_primary_oai_pubmedcentral_nih_gov_10998834 proquest_miscellaneous_3034246883 proquest_journals_3033931671 crossref_primary_10_1038_s41597_024_03180_9 springer_journals_10_1038_s41597_024_03180_9 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2024-04-06 |
PublicationDateYYYYMMDD | 2024-04-06 |
PublicationDate_xml | – month: 04 year: 2024 text: 2024-04-06 day: 06 |
PublicationDecade | 2020 |
PublicationPlace | London |
PublicationPlace_xml | – name: London |
PublicationTitle | Scientific data |
PublicationTitleAbbrev | Sci Data |
PublicationYear | 2024 |
Publisher | Nature Publishing Group UK Nature Publishing Group Nature Portfolio |
Publisher_xml | – name: Nature Publishing Group UK – name: Nature Publishing Group – name: Nature Portfolio |
SSID | ssj0001340570 |
Score | 2.3221073 |
Snippet | CO2 electroreduction has garnered significant attention from both the academic and industrial communities. Extracting crucial information related to catalysts... |
SourceID | doaj pubmedcentral proquest crossref springer |
SourceType | Open Website Open Access Repository Aggregation Database Index Database Publisher |
StartPage | 347 |
SubjectTerms | 639/301/299/886; 639/301/299/890; Algorithms; Artificial intelligence; Carbon dioxide; Catalysis; Catalysts; Chatbots; Data Descriptor; Data mining; Datasets; Humanities and Social Sciences; Information processing; Language; Large language models; Machine learning; Metadata; multidisciplinary; Natural language processing; Science; Science (multidisciplinary); Scientists; Subject specialists |
Title | Large language model enhanced corpus of CO2 reduction electrocatalysts and synthesis procedures |
URI | https://link.springer.com/article/10.1038/s41597-024-03180-9 https://www.proquest.com/docview/3033931671 https://www.proquest.com/docview/3034246883 https://pubmed.ncbi.nlm.nih.gov/PMC10998834 https://doaj.org/article/9956e1ef7b8840b9a895372b2cbe8d8f |
Volume | 11 |
linkProvider | Springer Nature |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Large+language+model+enhanced+corpus+of+CO2+reduction+electrocatalysts+and+synthesis+procedures&rft.jtitle=Scientific+data&rft.au=Chen%2C+Xueqing&rft.au=Gao%2C+Yang&rft.au=Wang%2C+Ludi&rft.au=Cui%2C+Wenjuan&rft.date=2024-04-06&rft.pub=Nature+Publishing+Group+UK&rft.eissn=2052-4463&rft.volume=11&rft.issue=1&rft_id=info:doi/10.1038%2Fs41597-024-03180-9&rft.externalDocID=10_1038_s41597_024_03180_9 |