Out-of-Distribution Generalization With Causal Feature Separation

Driven by empirical risk minimization, machine learning algorithm tends to exploit subtle statistical correlations existing in the training environment for prediction, while the spurious correlations are unstable across environments, leading to poor generalization performance. Accordingly, the probl...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on knowledge and data engineering Vol. 36; no. 4; pp. 1758 - 1772
Main Authors Wang, Haotian, Kuang, Kun, Lan, Long, Wang, Zige, Huang, Wanrong, Wu, Fei, Yang, Wenjing
Format Journal Article
LanguageEnglish
Published New York IEEE 01.04.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Driven by empirical risk minimization, machine learning algorithm tends to exploit subtle statistical correlations existing in the training environment for prediction, while the spurious correlations are unstable across environments, leading to poor generalization performance. Accordingly, the problem of the Out-of-distribution (OOD) generalization aims to exploit an invariant/stable relationship between features and outcomes that generalizes well on all possible environments. To address the spurious correlation induced by the selection bias, in this article, we propose a novel Clique-based Causal Feature Separation (CCFS) algorithm by explicitly incorporating the causal structure to identify causal features of outcome for OOD generalization. Specifically, the proposed CCFS algorithm identifies the largest clique in the learned causal skeleton. Theoretically, we guarantee that either the largest clique or the rest of the causal skeleton is exactly the set of all causal features of the outcome. Finally, we separate the causal features from the non-causal ones with a sample-reweighting decorrelator for OOD prediction. Extensive experiments validate the effectiveness of the proposed CCFS method on both causal feature identification and OOD generalization tasks.
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2023.3312255