Conformalization of Sparse Generalized Linear Models
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 11.07.2023 |
Subjects | |
Summary: | Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n,
y_n)\}$, the conformal prediction method estimates a confidence set for
$y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size by merely
assuming that the joint distribution of the data is permutation invariant.
Although the approach is attractive, computing such a set is infeasible in most
regression problems. Indeed, in these cases, the unknown variable $y_{n+1}$ can
take an infinite number of possible candidate values, and generating conformal
sets requires retraining a predictive model for each candidate. In this paper,
we focus on a sparse linear model with only a subset of variables for
prediction and use numerical continuation techniques to approximate the
solution path efficiently. The critical property we exploit is that the set of
selected variables is invariant under a small perturbation of the input data.
Therefore, it is sufficient to enumerate and refit the model only at the change
points of the set of active features and smoothly interpolate the rest of the
solution via a Predictor-Corrector mechanism. We show how our path-following
algorithm accurately approximates conformal prediction sets and illustrate its
performance using synthetic and real data examples. |
---|---|
DOI: | 10.48550/arxiv.2307.05109 |
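
The retraining cost mentioned in the summary can be made concrete with a minimal sketch of the naive baseline: full conformal prediction evaluated on a finite grid of candidates, with one sparse fit per candidate. This is not the paper's path-following algorithm. The function names (`soft_threshold`, `lasso_cd`, `conformal_set_grid`), the absolute-residual conformity score, the coordinate-descent Lasso solver, the grid, and the values of `lam` and `alpha` are all assumptions chosen for illustration.

```python
import random


def soft_threshold(v, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso by cyclic coordinate descent: min 0.5*||y - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes beta_j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def conformal_set_grid(X, y, x_new, candidates, lam=1.0, alpha=0.1):
    """Grid candidates kept by full conformal prediction (one refit each)."""
    X_aug = X + [x_new]
    kept = []
    for z in candidates:
        y_aug = y + [z]  # augment the data with the candidate label
        beta = lasso_cd(X_aug, y_aug, lam)
        # Assumed conformity score: absolute residual of the refitted model.
        scores = [abs(yi - sum(b * v for b, v in zip(beta, xi)))
                  for xi, yi in zip(X_aug, y_aug)]
        # Conformal p-value: fraction of scores at least as large as the candidate's.
        pval = sum(s >= scores[-1] for s in scores) / len(scores)
        if pval > alpha:
            kept.append(z)
    return kept


# Synthetic data (hypothetical): only the first two features are relevant.
random.seed(0)
n, p = 30, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.0 * row[1] + random.gauss(0, 0.5) for row in X]
x_new = [random.gauss(0, 1) for _ in range(p)]
grid = [-10 + 0.5 * k for k in range(41)]

kept = conformal_set_grid(X, y, x_new, grid)
print("candidates kept:", len(kept), "range:", min(kept), max(kept))
```

Each grid point triggers a full refit, so the cost grows linearly with the grid size and the result is only as fine as the grid. The summary's observation that the active set is stable under small perturbations is what would allow refitting only at active-set change points instead.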