Out-of-Distribution Generalization With Causal Feature Separation
Published in | IEEE transactions on knowledge and data engineering Vol. 36; no. 4; pp. 1758 - 1772 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.04.2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | Driven by empirical risk minimization, machine learning algorithms tend to exploit subtle statistical correlations present in the training environment for prediction. Such spurious correlations are unstable across environments, which leads to poor generalization performance. Accordingly, out-of-distribution (OOD) generalization aims to exploit an invariant (stable) relationship between features and outcomes that generalizes well to all possible environments. To address the spurious correlation induced by selection bias, in this article we propose a novel Clique-based Causal Feature Separation (CCFS) algorithm that explicitly incorporates the causal structure to identify the causal features of the outcome for OOD generalization. Specifically, the proposed CCFS algorithm identifies the largest clique in the learned causal skeleton. Theoretically, we guarantee that either the largest clique or the rest of the causal skeleton is exactly the set of all causal features of the outcome. Finally, we separate the causal features from the non-causal ones with a sample-reweighting decorrelator for OOD prediction. Extensive experiments validate the effectiveness of the proposed CCFS method on both causal feature identification and OOD generalization tasks. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2023.3312255 |
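
The summary's clique step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the causal skeleton is already available as an undirected graph, and it uses a standard Bron-Kerbosch search with pivoting to find one largest clique. The feature names `X1` to `X5`, the edge list, and the helpers `build_adjacency` and `largest_clique` are hypothetical and chosen only for illustration. The sketch does not decide which of the two parts is the causal feature set, and it does not cover skeleton learning or the sample-reweighting decorrelator.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_adjacency(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build a symmetric adjacency map from an undirected edge list."""
    adj: Graph = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def largest_clique(adj: Graph) -> Set[Hashable]:
    """Return one maximum clique using Bron-Kerbosch with pivoting.

    Assumes the graph has no self-loops. Worst-case running time is
    exponential, so this is only suitable for small skeletons.
    """
    best: Set[Hashable] = set()

    def expand(r: Set[Hashable], p: Set[Hashable], x: Set[Hashable]) -> None:
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest seen so far.
            if len(r) > len(best):
                best = set(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Hypothetical skeleton over five features (illustrative only).
edges = [("X1", "X2"), ("X1", "X3"), ("X2", "X3"), ("X3", "X4"), ("X4", "X5")]
skeleton = build_adjacency(edges)

clique = largest_clique(skeleton)   # one candidate feature set
rest = set(skeleton) - clique       # the complementary candidate set

print(sorted(clique), sorted(rest))
```

In this toy graph the triangle `X1`, `X2`, `X3` is the only largest clique, so the split is unambiguous. When a skeleton has several cliques of the same maximum size, the sketch returns just one of them, and how such ties should be handled is not stated in the summary.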