Assessing the Accuracy of Machine Learning Thermodynamic Perturbation Theory: Density Functional Theory and Beyond

Bibliographic Details
Main Authors: Herzog, Basile; da Silva, Mauricio Chagas; Casier, Bastien; Badawi, Michael; Pascale, Fabien; Bucko, Tomas; Lebegue, Sebastien; Rocca, Dario
Format: Journal Article
Language: English
Published: 13.10.2021
Summary: Machine learning thermodynamic perturbation theory (MLPT) is a promising approach for computing finite-temperature properties when the goal is to compare several levels of ab initio theory and/or to apply highly expensive computational methods. Starting from a production molecular dynamics trajectory, the method estimates properties at one or more target levels of theory from only a small number of additional fixed-geometry calculations, which are used to train a machine learning model. However, because MLPT is based on thermodynamic perturbation theory (TPT), inaccuracies may arise when the starting trajectory samples a configurational space that overlaps only slightly with that of the target approximations of interest. Using case studies of molecules adsorbed in zeolites and several density functional theory approximations, in this work we assess the accuracy of MLPT for ensemble total energies and enthalpies of adsorption. We analyze the problematic cases and show that, even without exact reference results, pathological cases for MLPT can be detected through a coefficient that measures the statistical imbalance induced by the TPT reweighting. For the most pathological examples, we recover target-level results within chemical accuracy by applying a machine learning-based Monte Carlo (MLMC) resampling. Finally, based on the ideas developed in this work, we assess and confirm the accuracy of recently published MLPT-based enthalpies of adsorption at the random phase approximation level, whose high computational cost would rule out a direct molecular dynamics simulation.
DOI: 10.48550/arxiv.2110.06818
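
The summary describes reweighting a production trajectory to a target level of theory and detecting cases where that reweighting becomes statistically imbalanced. The sketch below illustrates the general idea only. It assumes standard exponential TPT weights built from per-configuration energy differences (target minus production, in eV), and it uses the Kish effective sample size as a stand-in diagnostic; this is not necessarily the coefficient defined in the article. All function names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def tpt_weights(delta_e, temperature):
    """Normalized weights w_i proportional to exp(-dE_i / (k_B T)).

    delta_e: energy differences E_target - E_production in eV, one per
    configuration of the production trajectory (assumed input).
    """
    beta = 1.0 / (K_B_EV * temperature)
    # Shift by the minimum so the largest exponent is zero (avoids overflow).
    shift = min(delta_e)
    raw = [math.exp(-beta * (de - shift)) for de in delta_e]
    total = sum(raw)
    return [r / total for r in raw]


def reweighted_mean(values, weights):
    """Target-level ensemble average estimated from production samples."""
    return sum(w * v for w, v in zip(weights, values))


def effective_sample_fraction(weights):
    """Kish effective sample size divided by N.

    Equals 1.0 for uniform weights and approaches 1/N when a single
    configuration dominates. Used here as an illustrative imbalance
    measure, not as the article's own coefficient.
    """
    n_eff = 1.0 / sum(w * w for w in weights)
    return n_eff / len(weights)


if __name__ == "__main__":
    # Hypothetical energy differences (eV) for four configurations.
    balanced = tpt_weights([0.10, 0.10, 0.10, 0.10], 300.0)
    skewed = tpt_weights([0.0, 1.0, 1.0, 1.0], 300.0)
    print(effective_sample_fraction(balanced))
    print(effective_sample_fraction(skewed))
```

A fraction close to 1 suggests that the production and target ensembles overlap well. A fraction close to 1/N suggests that a handful of configurations dominate the average, the kind of situation in which the summary reports resorting to MLMC resampling.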