Tree based machine learning framework for predicting ground state energies of molecules
Published in | The Journal of chemical physics Vol. 145; no. 13; p. 134101 |
---|---|
Main Author | Himmetoglu, Burak |
Format | Journal Article |
Language | English |
Published | United States, 07.10.2016 |
Abstract | We present an application of the boosted regression tree algorithm for predicting ground state energies of molecules made up of C, H, N, O, P, and S (CHNOPS). The PubChem chemical compound database has been incorporated to construct a dataset of 16 242 molecules, whose electronic ground state energies have been computed using density functional theory. This dataset is used to train the boosted regression tree algorithm, which allows a computationally efficient and accurate prediction of molecular ground state energies. Predictions from boosted regression trees are compared with neural network regression, a widely used method in the literature, and shown to be more accurate with significantly reduced computational cost. The performance of the regression model trained using the CHNOPS set is also tested on a set of distinct molecules that contain additional Cl and Si atoms. It is shown that the learning algorithms lead to a rich and diverse possibility of applications in molecular discovery and materials informatics. |
---|---|
Author | Himmetoglu, Burak |
Author_xml | – sequence: 1 givenname: Burak surname: Himmetoglu fullname: Himmetoglu, Burak organization: Center for Scientific Computing, University of California, Santa Barbara, California 93106, USA and Enterprise Technology Services, University of California, Santa Barbara, California 93106, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/27782427$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
DBID | NPM |
DOI | 10.1063/1.4964093 |
DatabaseName | PubMed |
DatabaseTitle | PubMed |
DatabaseTitleList | PubMed |
Discipline | Chemistry Physics |
EISSN | 1089-7690 |
ExternalDocumentID | 27782427 |
Genre | Journal Article |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 13 |
PMID | 27782427 |
ParticipantIDs | pubmed_primary_27782427 |
PublicationCentury | 2000 |
PublicationDate | 2016-Oct-07 |
PublicationDateYYYYMMDD | 2016-10-07 |
PublicationDate_xml | – month: 10 year: 2016 text: 2016-Oct-07 day: 07 |
PublicationDecade | 2010 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | The Journal of chemical physics |
PublicationTitleAlternate | J Chem Phys |
PublicationYear | 2016 |
SourceID | pubmed |
SourceType | Index Database |
StartPage | 134101 |
Title | Tree based machine learning framework for predicting ground state energies of molecules |
URI | https://www.ncbi.nlm.nih.gov/pubmed/27782427 |
Volume | 145 |