Causal Regularization
Format: Journal Article
Language: English
Published: 28.06.2019
DOI: 10.48550/arxiv.1906.12179
Summary: I argue that regularizing terms in standard regression methods not only help
against overfitting finite data, but sometimes also yield better causal models
in the infinite-sample regime. I first consider a multi-dimensional variable
linearly influencing a target variable, with some multi-dimensional unobserved
common cause, where the confounding effect can be decreased by keeping the
penalizing term in Ridge and Lasso regression even in the population limit.
Choosing the size of the penalizing term is, however, challenging, because
cross-validation is pointless. Here it is done by first estimating the strength
of confounding via a previously proposed method, which yielded reasonable
results for simulated and real data.
Further, I prove a 'causal generalization bound' which states (subject to a
particular model of confounding) that the error made by interpreting any
non-linear regression as a causal model can be bounded from above whenever the
functions are taken from a not-too-rich class. In other words, the bound
guarantees "generalization" from observational to interventional distributions,
which is usually not the subject of statistical learning theory (and is only
possible here due to the underlying symmetries of the confounder model).
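To illustrate the first claim, that keeping a Ridge penalty can reduce confounding bias even with effectively unlimited data, here is a minimal simulated sketch. The dimensions, coefficient values, and penalty strength below are illustrative assumptions of mine, not taken from the paper; the paper instead estimates the penalty from the inferred confounding strength.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200_000, 3

a = np.ones(d)  # true causal coefficients X -> Y (assumed for this sketch)
c = np.ones(d)  # hidden confounder -> Y weights (assumed)

Z = rng.normal(size=(n, d))             # unobserved common cause
X = Z + rng.normal(size=(n, d))         # observed cause; per-coordinate variance 2
y = X @ a + Z @ c + rng.normal(size=n)  # target, confounded through Z

# Ordinary least squares: biased toward a + Sigma_XX^{-1} Cov(X, Zc),
# i.e. roughly 1.5 per coordinate here instead of the causal value 1.
ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge with a penalty kept even at large n; lam = 1.0 is hand-picked
# to offset the confounding in this particular setup.
lam = 1.0
ridge = np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)

err_ols = np.linalg.norm(ols - a)
err_ridge = np.linalg.norm(ridge - a)
print(f"OLS error:   {err_ols:.3f}")
print(f"Ridge error: {err_ridge:.3f}")
```

With this symmetric setup the population OLS coefficients are 1.5 per coordinate (causal value plus confounding bias 0.5), while the Ridge solution with the chosen penalty lands near the causal coefficients; the shrinkage is doing causal, not just statistical, work. In practice the right penalty is unknown, which is exactly why the paper resorts to estimating confounding strength first.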