Predicting band gaps of MOFs on small data by deep transfer learning with data augmentation strategies

Bibliographic Details
Published in RSC Advances Vol. 13; no. 25; pp. 16952-16962
Main Authors Zhang, Zhihui, Zhang, Chengwei, Zhang, Yutao, Deng, Shengwei, Yang, Yun-Fang, Su, An, She, Yuan-Bin
Format Journal Article
Language English
Published England Royal Society of Chemistry 05.06.2023
Summary: Porphyrin-based MOFs combine the unique photophysical and electrochemical properties of metalloporphyrins with the catalytic efficiency of MOF materials, making them important candidates for light energy harvesting and conversion. However, accurate prediction of the band gap of porphyrin-based MOFs is hampered by their complex structure-function relationships. Although machine learning (ML) has performed well in predicting the properties of MOFs with large training datasets, such ML applications become challenging when the training data size of the materials is small. In this study, we first constructed a dataset of 202 porphyrin-based MOFs using DFT computations and increased the training data size using two data augmentation strategies. After that, four state-of-the-art neural network models were pre-trained with the recognized open-source database QMOF and fine-tuned with our augmented self-curated datasets. The GCN models predicted the band gaps of the porphyrin-based materials with the lowest RMSE of 0.2767 eV and MAE of 0.1463 eV. In addition, the rotation-and-mirroring data augmentation strategy effectively decreased the RMSE by 38.51% and the MAE by 50.05%. This study demonstrates that, when proper transfer learning and data augmentation strategies are applied, machine learning models can predict the properties of MOFs using small training data. Pretrained deep learning models are fine-tuned on our porphyrin-based MOF database using data augmentation strategies to demonstrate how deep transfer learning can predict the properties of MOFs with limited training data.
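The summary names rotation and mirroring as an augmentation strategy and reports RMSE and MAE, but it does not say how either was implemented. The following is only a minimal sketch under stated assumptions: structures are represented as plain lists of Cartesian atomic coordinates, rotations are about the z-axis, and mirroring is through the yz-plane. The function names (`rotate_z`, `mirror_x`, `augment`, `rmse`, `mae`) are hypothetical and are not taken from the article. For a periodic crystal, the lattice vectors would need the same transformation, which this sketch omits.

```python
import math

def rotate_z(coords, angle_deg):
    """Rotate Cartesian coordinates about the z-axis by angle_deg degrees."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def mirror_x(coords):
    """Reflect coordinates through the yz-plane (x -> -x)."""
    return [(-x, y, z) for x, y, z in coords]

def augment(coords, band_gap, angles=(90, 180, 270)):
    """Return (coords, label) pairs: the original, rotated copies,
    and a mirror image of each. The label is unchanged because the
    band gap does not depend on the orientation of the structure."""
    variants = [coords] + [rotate_z(coords, a) for a in angles]
    variants += [mirror_x(v) for v in variants]
    return [(v, band_gap) for v in variants]

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
```

With the default three angles, `augment` turns one labeled structure into eight labeled samples (four orientations plus their mirror images). Interatomic distances are preserved by every transformation, so the augmented samples describe the same material in different coordinate frames.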
Bibliography: https://doi.org/10.1039/d3ra02142d
Electronic supplementary information (ESI) available. See DOI
These authors contributed equally to this work.
ISSN: 2046-2069
DOI: 10.1039/d3ra02142d