Dirichlet Variational Autoencoder

Bibliographic Details
Published in: Pattern Recognition, Vol. 107, p. 107514
Main Authors: Joo, Weonyoung; Lee, Wonsung; Park, Sungrae; Moon, Il-Chul
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.11.2020
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2020.107514

Summary:
• This paper is a study of the Dirichlet prior in the variational autoencoder.
• Our model outperforms baseline variational autoencoders in terms of log-likelihood.
• Our model produces a more meaningful and interpretable latent representation, with no component collapsing, compared to baseline variational autoencoders.
• Our model achieves the best classification accuracy in (semi-)supervised classification tasks compared to baseline variational autoencoders.
• Our model shows better performance in topic model augmentation.

This paper proposes the Dirichlet Variational Autoencoder (DirVAE), which uses a Dirichlet prior. To infer the parameters of DirVAE, we use the stochastic gradient method, approximating the inverse cumulative distribution function of the Gamma distribution, which is a component of the Dirichlet distribution. This approximation of the new prior led to an investigation of component collapsing, and DirVAE revealed that component collapsing originates from two problem sources: decoder weight collapsing and latent value collapsing. The experimental results show that 1) DirVAE achieves the best log-likelihood compared to the baselines; 2) DirVAE produces more interpretable latent values, without the collapsing issues that the baselines suffer from; 3) the latent representation from DirVAE achieves the best classification accuracy in (semi-)supervised classification tasks on MNIST, OMNIGLOT, COIL-20, SVHN, and CIFAR-10 compared to the baseline VAEs; and 4) DirVAE-augmented topic models show better performance in most cases.
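
The key technical step named in the abstract, reparameterized sampling from a Dirichlet via an approximate inverse CDF of its Gamma components, can be illustrated with a short sketch. This is not the authors' code: it uses one common closed-form small-shape approximation, F⁻¹(u; α, β) ≈ (u·α·Γ(α))^(1/α) / β, and the function names below are hypothetical.

import torch

# Approximate inverse CDF of Gamma(alpha, beta), reasonable for small shape alpha.
# Closed form: F^{-1}(u; alpha, beta) ~= (u * alpha * Gamma(alpha))**(1/alpha) / beta
def gamma_icdf_approx(u, alpha, beta=1.0):
    return (u * alpha * torch.exp(torch.lgamma(alpha))) ** (1.0 / alpha) / beta

# Differentiable Dirichlet(alpha) draw: push uniform noise through the approximate
# inverse CDF to get Gamma(alpha_k, 1) samples, then normalize across the last axis.
def dirichlet_rsample(alpha):
    u = torch.rand_like(alpha)                 # uniform noise: the only stochastic input
    g = gamma_icdf_approx(u, alpha)            # approximate Gamma(alpha_k, 1) samples
    return g / g.sum(dim=-1, keepdim=True)     # normalized Gammas follow Dirichlet(alpha)

# Usage: a batch of 4 draws over 10 components; gradients reach alpha through the sample.
alpha = torch.full((4, 10), 0.5, requires_grad=True)
z = dirichlet_rsample(alpha)
(z ** 2).sum().backward()                      # any downstream loss works
print(alpha.grad.shape)                        # torch.Size([4, 10])

Because the sample is a deterministic function of the Dirichlet parameters and uniform noise, the pathwise (reparameterization) gradient estimator applies, which is what makes stochastic gradient inference of the prior's parameters possible.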