Predicting molecular dipole moments by combining atomic partial charges and atomic dipoles


Bibliographic Details
Published in: The Journal of chemical physics Vol. 153; no. 2; pp. 024113 - 24126
Main Authors: Veit, Max; Wilkins, David M.; Yang, Yang; DiStasio, Robert A.; Ceriotti, Michele
Format: Journal Article
Language: English
Published: Melville, American Institute of Physics, 14.07.2020
Summary: The molecular dipole moment (μ) is a central quantity in chemistry. It is essential in predicting infrared and sum-frequency generation spectra as well as induction and long-range electrostatic interactions. Furthermore, it can be extracted directly, via the ground state electron density, from high-level quantum mechanical calculations, making it an ideal target for machine learning (ML). In this work, we choose to represent this quantity with a physically inspired ML model that captures two distinct physical effects: local atomic polarization is captured within the symmetry-adapted Gaussian process regression framework, which assigns a (vector) dipole moment to each atom, while the movement of charge across the entire molecule is captured by assigning a partial (scalar) charge to each atom. The resulting “MuML” models are fitted together to reproduce molecular μ computed using high-level coupled-cluster theory and density functional theory (DFT) on the QM7b dataset, achieving more accurate results due to the physics-based combination of these complementary terms. The combined model shows excellent transferability when applied to a showcase dataset of larger and more complex molecules, approaching the accuracy of DFT at a small fraction of the computational cost. We also demonstrate that the uncertainty in the predictions can be estimated reliably using a calibrated committee model. The ultimate performance of the models (and the optimal weighting of their combination) depends, however, on the details of the system at hand, with the scalar model being clearly superior when describing large molecules whose dipole is almost entirely generated by charge separation. These observations point to the importance of simultaneously accounting for the local and non-local effects that contribute to μ; furthermore, they define a challenging task to benchmark future models, particularly those aimed at the description of condensed phases.
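As a rough illustration of how the two terms described in the summary can be combined, the sketch below adds a charge-separation contribution (partial charge times atomic position) to a sum of per-atom dipole vectors. It is not the MuML implementation: the function name `molecular_dipole`, the toy geometry, and all numerical values are hypothetical, and in the paper the per-atom charges and dipoles are predicted by fitted ML models rather than supplied by hand. The formula μ = Σᵢ (qᵢ rᵢ + μᵢ) is assumed here as the conventional way to combine scalar charges and atomic dipoles.

```python
def molecular_dipole(positions, charges, atomic_dipoles):
    """Combine per-atom partial charges and per-atom dipole vectors.

    Assumed formula (not taken verbatim from the paper):
        mu = sum_i (q_i * r_i + mu_i)
    positions and atomic_dipoles are lists of 3-component sequences,
    charges is a list of scalars; all values are placeholders.
    """
    total = [0.0, 0.0, 0.0]
    for r, q, mu in zip(positions, charges, atomic_dipoles):
        for k in range(3):
            # charge-separation term plus local polarization term
            total[k] += q * r[k] + mu[k]
    return total


# Hypothetical neutral diatomic aligned with the z axis.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
charges = [0.3, -0.3]  # made-up partial charges summing to zero
atomic_dipoles = [(0.0, 0.0, 0.05), (0.0, 0.0, -0.02)]  # made-up values

mu_total = molecular_dipole(positions, charges, atomic_dipoles)
print(mu_total)
```

Because the toy charges sum to zero, the charge term does not depend on the choice of origin; for a charged species that would no longer hold and a reference point would have to be fixed.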
Bibliography: USDOE; AC02-05CH11231
ISSN: 0021-9606; 1089-7690
DOI: 10.1063/5.0009106