The impact of domain-driven and data-driven feature selection on the inverse design of nanoparticle catalysts
Published in | Journal of computational science Vol. 65; p. 101896 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.11.2022 |
Summary: | Incorporating practical considerations into machine learning can make predictions more actionable. However, researcher interventions in the learning process may have negative impacts on model performance, leading to a trade-off between accuracy and utility. In this paper we use multi-target machine learning to predict the structure of platinum nanocatalysts based on property indicators and develop intervention scenarios using ratios of data-driven (optimal) and domain-driven (preferable) variables during feature selection. We show that minor interventions to data-driven feature selection can be tolerated, and even improve model performance, but aggressive domain-driven feature selection degrades performance, even if the mapping function is perfectly balanced. • Multi-target prediction of catalysis properties with data-driven feature selection. • Inversely predicting multiple physicochemical features using subsets of properties. • Quantified impact of domain-driven feature selection on the predictive models. |
---|---|
ISSN: | 1877-7503 1877-7511 |
DOI: | 10.1016/j.jocs.2022.101896 |
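
The summary describes intervention scenarios built from ratios of data-driven and domain-driven variables. The record does not give the procedure, so the following is only a minimal sketch of one way such a mixed feature subset could be assembled. The function `mixed_feature_subset`, the parameter `domain_ratio`, and the feature names `f1` to `f7` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def mixed_feature_subset(
    data_ranked: Sequence[str],
    domain_preferred: Sequence[str],
    n_features: int,
    domain_ratio: float,
) -> List[str]:
    """Build a feature subset mixing domain-driven and data-driven picks.

    Hypothetical illustration only: a `domain_ratio` share of the slots is
    filled from the researcher's preferred list, and the rest comes from a
    data-driven ranking (best first).
    """
    if not 0.0 <= domain_ratio <= 1.0:
        raise ValueError("domain_ratio must be in [0, 1]")
    # Number of slots reserved for domain-driven features, capped by how
    # many preferred features exist (round() sends exact halves to even).
    n_domain = min(round(n_features * domain_ratio), len(domain_preferred))
    chosen = list(domain_preferred[:n_domain])
    # Fill the remaining slots from the data-driven ranking, skipping
    # anything already chosen.
    for name in data_ranked:
        if len(chosen) >= n_features:
            break
        if name not in chosen:
            chosen.append(name)
    return chosen


# Placeholder inputs: a data-driven ranking and a domain-preferred list.
data_ranked = ["f1", "f2", "f3", "f4", "f5"]
domain_preferred = ["f4", "f6", "f7"]

# Three scenarios, from no intervention to fully domain-driven.
for ratio in (0.0, 0.5, 1.0):
    print(ratio, mixed_feature_subset(data_ranked, domain_preferred, 4, ratio))
```

Each subset would then be used to train and score a multi-target model, so that performance can be compared across ratios. That step is omitted here because the record does not specify the models or metrics used.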