Rethinking Distance Metrics for Counterfactual Explainability

Bibliographic Details
Main Authors: Williams, Joshua Nathaniel; Katakkar, Anurag; Heidari, Hoda; Kolter, J. Zico
Format: Journal Article
Language: English
Published: 18.10.2024
Summary: Counterfactual explanations are a popular method of post-hoc explainability across a variety of machine learning settings. Such methods explain a classifier by generating new data points that are similar to a given reference but receive a more desirable prediction. In this work, we investigate a framing for counterfactual generation methods that treats counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution. Through this framing, we derive a distance metric tailored to counterfactual similarity that can be applied to a broad range of settings. Through both quantitative and qualitative analyses of counterfactual generation methods, we show that this framing allows us to express more nuanced dependencies among the covariates.
DOI: 10.48550/arxiv.2410.14522