GAMI-Net: An explainable neural network based on generalized additive models with structured interactions


Bibliographic Details
Published in: Pattern Recognition, Vol. 120, p. 108192
Main Authors: Yang, Zebin; Zhang, Aijun; Sudjianto, Agus
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.12.2021
Summary:

•A novel explainable neural network is proposed for modeling main effects and structured interactions.
•GAMI-Net is a disentangled feedforward network with multiple additive subnetworks.
•GAMI-Net takes into account three interpretability constraints: sparsity, heredity, and marginal clarity.
•An adaptive training algorithm is developed for training GAMI-Net efficiently.
•GAMI-Net enjoys superior interpretability and outperforms benchmark methods.

The lack of interpretability is an inevitable problem when using neural network models in real applications. In this paper, an explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is proposed to pursue a good balance between prediction accuracy and model interpretability. GAMI-Net is a disentangled feedforward network with multiple additive subnetworks; each subnetwork consists of multiple hidden layers and is designed to capture one main effect or one pairwise interaction. Three interpretability aspects are further considered: a) sparsity, to select the most significant effects for parsimonious representations; b) heredity, whereby a pairwise interaction is included only when at least one of its parent main effects is present; and c) marginal clarity, to make main effects and pairwise interactions mutually distinguishable. An adaptive training algorithm is developed in which main effects are trained first and pairwise interactions are then fitted to the residuals. Numerical experiments on both synthetic functions and real-world datasets show that the proposed model enjoys superior interpretability while maintaining prediction accuracy competitive with the explainable boosting machine and other classic machine learning models.
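The additive-subnetwork structure described in the abstract can be sketched in plain NumPy: one small subnetwork per main effect, one per retained pairwise interaction, with outputs summed. This is only an illustrative forward pass under assumed names (`subnet`, `gami_forward`); it does not reproduce the paper's training procedure, sparsity pruning, or marginal-clarity regularization.

```python
import numpy as np

rng = np.random.default_rng(0)

def subnet(in_dim, hidden=8):
    """One small MLP per effect: in_dim -> hidden -> 1, tanh activation."""
    return {"W1": rng.normal(0.0, 0.5, (in_dim, hidden)),
            "b1": np.zeros(hidden),
            "W2": rng.normal(0.0, 0.5, (hidden, 1)),
            "b2": np.zeros(1)}

def subnet_forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return (h @ p["W2"] + p["b2"]).ravel()

n_features = 3
# One ridge-function subnetwork per main effect.
main_nets = [subnet(1) for _ in range(n_features)]
# Heredity constraint: the interaction (0, 1) is admissible only because
# at least one of its parent main effects is kept in the model.
pair_nets = {(0, 1): subnet(2)}

def gami_forward(X, bias=0.0):
    """Additive prediction: bias + sum of main effects + sum of interactions."""
    out = np.full(X.shape[0], bias)
    for j, net in enumerate(main_nets):           # main-effect subnetworks
        out += subnet_forward(net, X[:, [j]])
    for (i, j), net in pair_nets.items():         # pairwise-interaction subnetworks
        out += subnet_forward(net, X[:, [i, j]])
    return out

X = rng.normal(size=(5, n_features))
print(gami_forward(X).shape)  # (5,)
```

Because the prediction decomposes additively, each fitted subnetwork can be plotted as a one- or two-dimensional function of its own inputs, which is the source of the model's interpretability.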
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2021.108192