The impact of domain-driven and data-driven feature selection on the inverse design of nanoparticle catalysts


Bibliographic Details
Published in: Journal of Computational Science, Vol. 65; p. 101896
Main Authors: Li, Sichao; Ting, Jonathan Y.C.; Barnard, Amanda S.
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2022
Summary: Incorporating practical considerations into machine learning can make predictions more actionable. However, researcher interventions in the learning process may have negative impacts on model performance, leading to a trade-off between accuracy and utility. In this paper, we use multi-target machine learning to predict the structure of platinum nanocatalysts based on property indicators and develop intervention scenarios using ratios of data-driven (optimal) and domain-driven (preferable) variables during feature selection. We show that minor interventions to data-driven feature selection can be tolerated, and even improve model performance, but aggressive domain-driven feature selection degrades performance, even if the mapping function is perfectly balanced.
• Multi-target prediction of catalysis properties with data-driven feature selection.
• Inversely predicting multiple physicochemical features using subsets of properties.
• Quantified impact of domain-driven feature selection on the predictive models.
ISSN: 1877-7503, 1877-7511
DOI: 10.1016/j.jocs.2022.101896