Machine Learning Using Combined Structural and Chemical Descriptors for Prediction of Methane Adsorption Performance of Metal Organic Frameworks (MOFs)

Using molecular simulation for adsorbent screening is computationally expensive and thus prohibitive to materials discovery. Machine learning (ML) algorithms trained on fundamental material properties can potentially provide quick and accurate methods for screening purposes. Prior efforts have focus...

Full description

Saved in:
Bibliographic Details
Published inACS combinatorial science Vol. 19; no. 10; pp. 640 - 645
Main Authors Pardakhti, Maryam, Moharreri, Ehsan, Wanik, David, Suib, Steven L, Srivastava, Ranjan
Format Journal Article
LanguageEnglish
Published United States American Chemical Society 09.10.2017
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Using molecular simulation for adsorbent screening is computationally expensive and thus prohibitive to materials discovery. Machine learning (ML) algorithms trained on fundamental material properties can potentially provide quick and accurate methods for screening purposes. Prior efforts have focused on structural descriptors for use with ML. In this work, the use of chemical descriptors, in addition to structural descriptors, was introduced for adsorption analysis. Evaluation of structural and chemical descriptors coupled with various ML algorithms, including decision tree, Poisson regression, support vector machine and random forest, were carried out to predict methane uptake on hypothetical metal organic frameworks. To highlight their predictive capabilities, ML models were trained on 8% of a data set consisting of 130,398 MOFs and then tested on the remaining 92% to predict methane adsorption capacities. When structural and chemical descriptors were jointly used as ML input, the random forest model with 10-fold cross validation proved to be superior to the other ML approaches, with an R 2 of 0.98 and a mean absolute percent error of about 7%. The training and prediction using the random forest algorithm for adsorption capacity estimation of all 130,398 MOFs took approximately 2 h on a single personal computer, several orders of magnitude faster than actual molecular simulations on high-performance computing clusters.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2156-8952
2156-8944
2156-8944
DOI:10.1021/acscombsci.7b00056