It's FLAN time! Summing feature-wise latent representations for interpretability
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | 18.06.2021 |
Subjects: | |
Online Access: | Get full text |
Summary: | Interpretability has become a necessary feature for machine learning models deployed in critical scenarios, e.g. the legal system or healthcare. In these situations, algorithmic decisions may have (potentially negative) long-lasting effects on the end-users affected by them. In many cases the representational power of deep learning models is not needed, so simple and interpretable models (e.g. linear models) should be preferred. However, in high-dimensional and/or complex domains (e.g. computer vision), the universal approximation capabilities of neural networks are required. Inspired by linear models and the Kolmogorov-Arnold representation theorem, we propose a novel class of structurally constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks). Crucially, FLANs process each input feature separately, computing for each of them a representation in a common latent space. These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction. These constraints (which are at the core of the interpretability of linear models) allow a user to estimate the effect of each individual feature independently from the others, enhancing interpretability. In a set of experiments across different domains, we show that, without excessively compromising test performance, the structural constraints proposed in FLANs indeed facilitate the interpretability of deep learning models. We quantitatively compare the interpretability of FLANs to that of post-hoc methods using recently introduced metrics, discussing the advantages of natively interpretable models over post-hoc analysis. |
DOI: | 10.48550/arxiv.2106.10086 |
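The architecture described in the abstract is concrete enough to sketch. Below is a minimal PyTorch sketch of the idea exactly as stated there (one encoder per input feature mapping into a shared latent space, a sum over the per-feature latents, then a prediction head). The class name `FLANSketch`, the encoder depth, and the layer sizes are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class FLANSketch(nn.Module):
    """Sketch of a Feature-wise Latent Additive Network, per the abstract:
    each input feature is encoded separately into a common latent space,
    the per-feature representations are summed, and the aggregated
    representation is used for prediction."""

    def __init__(self, n_features: int, latent_dim: int = 32, n_outputs: int = 1):
        super().__init__()
        # One small encoder per scalar feature (sizes are illustrative).
        self.encoders = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(1, 64),
                    nn.ReLU(),
                    nn.Linear(64, latent_dim),
                )
                for _ in range(n_features)
            ]
        )
        # Head mapping the summed latent representation to the prediction.
        self.head = nn.Linear(latent_dim, n_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, n_features); each column is encoded independently.
        latents = [enc(x[:, i : i + 1]) for i, enc in enumerate(self.encoders)]
        # Summing keeps each feature's contribution separable, which is what
        # enables the per-feature effect estimates the abstract describes.
        z = torch.stack(latents, dim=0).sum(dim=0)
        return self.head(z)

model = FLANSketch(n_features=8)
y = model(torch.randn(4, 8))  # -> shape (4, 1)
```

Because the latent representations enter the prediction only through a sum, the effect of feature i can be inspected in isolation by passing just that feature through its encoder and the head, mirroring how coefficients are read off a linear model.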