Deep Discriminative to Kernel Density Graph for In- and Out-of-distribution Calibrated Inference
Main Authors | , , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 31.01.2022 |
Subjects | |
Summary: | Deep discriminative approaches like random forests and deep neural networks
have recently found applications in many important real-world scenarios.
However, deploying these learning algorithms in safety-critical applications
raises concerns, particularly when it comes to ensuring confidence calibration
for both in-distribution and out-of-distribution data points. Many popular
methods for in-distribution (ID) calibration, such as isotonic and Platt's
sigmoidal regression, exhibit excellent ID calibration performance. However,
these methods are not calibrated for the entire feature space, leading to
overconfidence in the case of out-of-distribution (OOD) samples. On the other
end of the spectrum, existing OOD calibration methods
generally exhibit poor ID calibration. In this paper, we
address ID and OOD calibration problems jointly. We leverage the fact that
deep models, including both random forests and deep-nets, learn internal
representations which are unions of polytopes with affine activation functions
to conceptualize them both as partitioning rules of the feature space. We
replace the affine function in each polytope populated by the training data
with a Gaussian kernel. Our experiments on both tabular and vision benchmarks
show that the proposed approaches obtain well-calibrated posteriors while
mostly preserving or improving the classification accuracy of the original
algorithm in the ID region, and extrapolate beyond the training data to handle OOD
inputs appropriately. |
---|---|
DOI: | 10.48550/arxiv.2201.13001 |