MonoNet: Towards Interpretable Models by Learning Monotonic Features
Main Authors | ,
---|---
Format | Journal Article
Language | English
Published | 30.09.2019
Subjects |
Online Access | Get full text
Summary: Being able to interpret, or explain, the predictions made by a machine learning model is of fundamental importance. This is especially true when there is interest in deploying data-driven models to make high-stakes decisions, e.g. in healthcare. While recent years have seen increasing interest in interpretable machine learning research, the field currently lacks an agreed-upon definition of interpretability, and some researchers have called for a more active conversation towards a rigorous approach to interpretability. Joining this conversation, we claim in this paper that the difficulty of interpreting a complex model stems from the existing interactions among features. We argue that by enforcing monotonicity between features and outputs, we are able to reason about the effect of a single feature on an output independently from other features, and consequently better understand the model. We show how to structurally introduce this constraint in deep learning models by adding new simple layers. We validate our model on benchmark datasets and compare our results with previously proposed interpretable models.
DOI: 10.48550/arxiv.1909.13611
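The abstract describes structurally enforcing monotonicity between learned features and outputs by adding simple layers to a deep network. One common way to realize such a layer is to constrain its weights to be positive, since a linear map with positive weights composed with monotonic activations is monotonic in each input. The sketch below illustrates that idea in PyTorch; the class name `MonotonicLinear` and the example architecture are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicLinear(nn.Module):
    """Linear layer whose effective weights are forced positive via
    softplus, so each output is monotonically non-decreasing in every
    input (an illustrative construction; the paper's layers may differ)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.raw_weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # softplus(raw_weight) > 0, so increasing any input feature
        # can only increase (or leave unchanged) each output.
        return F.linear(x, F.softplus(self.raw_weight), self.bias)

# Hypothetical usage: an unconstrained feature extractor followed by a
# monotonic head. Stacking positive-weight layers with monotonic
# activations (here tanh) keeps the head monotonic in the learned
# features, so each feature's effect on the output can be reasoned
# about independently of the others.
model = nn.Sequential(
    nn.Linear(8, 16), nn.Tanh(),        # free to learn arbitrary features
    MonotonicLinear(16, 8), nn.Tanh(),  # monotonic block begins here
    MonotonicLinear(8, 1),
)
y = model(torch.randn(4, 8))  # output shape: (4, 1)
```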