Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization
Main Authors | , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 07.12.2020 |
Subjects | |
Summary: | Dimensionality reduction methods for count data are critical to a wide range
of applications in medical informatics and other fields where model
interpretability is paramount. For such data, hierarchical Poisson matrix
factorization (HPF) and other sparse probabilistic non-negative matrix
factorization (NMF) methods are considered to be interpretable generative
models. They consist of sparse transformations for decoding their learned
representations into predictions. However, sparsity in representation decoding
does not necessarily imply sparsity in the encoding of representations from the
original data features. HPF is often incorrectly interpreted in the literature
as if it possesses encoder sparsity. The distinction between decoder sparsity
and encoder sparsity is subtle but important. Due to the lack of encoder
sparsity, HPF does not possess the column-clustering property of classical NMF:
the factor loading matrix does not sufficiently define how each factor is
formed from the original features. We address this deficiency by
self-consistently enforcing encoder sparsity, using a generalized additive
model (GAM), thereby allowing one to relate each representation coordinate to a
subset of the original data features. In doing so, the method also gains the
ability to perform feature selection. We demonstrate our method on simulated
data and give an example of how encoder sparsity is of practical use in a
concrete application of representing inpatient comorbidities in Medicare
patients. |
---|---|
Bibliography: | ICLR 2021 |
DOI: | 10.48550/arxiv.2012.04171 |
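
The distinction the summary draws between decoder sparsity and encoder sparsity can be illustrated with a small linear analogy. The sketch below is not the paper's HPF or GAM method: it assumes a purely linear model in which the decoder is a non-negative loading matrix and the encoder is its least-squares pseudo-inverse. The matrices `overlapping` and `disjoint` and the helper `support_size` are hypothetical names introduced only for this illustration.

```python
import numpy as np


def support_size(matrix, tol=1e-9):
    """Count the entries whose magnitude exceeds tol."""
    return int(np.sum(np.abs(matrix) > tol))


# Hypothetical decoders of shape (factors, features).
# A zero means the factor does not contribute to that feature.
overlapping = np.array([[1.0, 1.0, 0.0],
                        [0.0, 1.0, 1.0]])       # factors share feature 1
disjoint = np.array([[1.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0]])     # column-clustered supports

# Assumed linear encoder: z = x @ pinv(W), the least-squares inverse of x = z @ W.
enc_overlapping = np.linalg.pinv(overlapping)   # shape (features, factors)
enc_disjoint = np.linalg.pinv(disjoint)

for name, dec, enc in [("overlapping", overlapping, enc_overlapping),
                       ("disjoint", disjoint, enc_disjoint)]:
    print(f"{name}: decoder non-zeros = {support_size(dec)} of {dec.size}, "
          f"encoder non-zeros = {support_size(enc)} of {enc.size}")
```

In this toy setting, the decoder with overlapping supports is sparse, yet every entry of its least-squares encoder is non-zero, so each representation coordinate depends on all features. Only when the supports are disjoint, as in the column-clustering case, does the encoder inherit the decoder's sparsity pattern.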