Neural Topic Modeling with Bidirectional Adversarial Training
Main Authors | , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 26.04.2020 |
Subjects | |
Summary: | Recent years have witnessed a surge of interest in using neural topic models
for automatic topic extraction from text, since they avoid the complicated
mathematical derivations for model inference required by traditional topic models
such as Latent Dirichlet Allocation (LDA). However, these models either
assume an improper prior (e.g. Gaussian or Logistic Normal) over the latent
topic space or cannot infer the topic distribution for a given document. To
address these limitations, we propose a neural topic modeling approach, called the
Bidirectional Adversarial Topic (BAT) model, which represents the first attempt
at applying bidirectional adversarial training to neural topic modeling. The
proposed BAT builds a two-way projection between the document-topic
distribution and the document-word distribution. It uses a generator to capture
the semantic patterns from texts and an encoder for topic inference.
Furthermore, to incorporate word relatedness information, BAT is extended to the
Bidirectional Adversarial Topic model with Gaussian (Gaussian-BAT). To
verify the effectiveness of BAT and Gaussian-BAT, three benchmark corpora are
used in our experiments. The experimental results show that BAT and
Gaussian-BAT obtain more coherent topics, outperforming several competitive
baselines. Moreover, when performing text clustering based on the extracted
topics, our models outperform all the baselines, with more significant
improvements achieved by Gaussian-BAT, where an increase of nearly 6% in accuracy
is observed. |
---|---|
DOI: | 10.48550/arxiv.2004.12331 |
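
The two-way projection described in the summary can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `ToyBAT`, the single-layer softmax generator and encoder, the linear discriminator over joint (topic, word) pairs, and the Dirichlet prior with concentration 0.5 are all hypothetical choices made for illustration. The discriminator in particular is assumed from generic bidirectional adversarial training and is not mentioned in the record. Weights are random and untrained.

```python
import math
import random


def softmax(scores):
    """Map real-valued scores to a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def sample_dirichlet(alpha, size, rng):
    """Draw from a symmetric Dirichlet by normalizing gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


class ToyBAT:
    """Hypothetical, untrained stand-in for the generator/encoder pair."""

    def __init__(self, vocab_size, num_topics, seed=0):
        self.rng = random.Random(seed)
        self.gen_w = random_matrix(vocab_size, num_topics, self.rng)
        self.enc_w = random_matrix(num_topics, vocab_size, self.rng)
        # Assumed discriminator: one linear score over the joint pair.
        self.disc_w = [self.rng.gauss(0.0, 0.1)
                       for _ in range(num_topics + vocab_size)]

    def generate(self, theta):
        """Generator: document-topic distribution -> document-word distribution."""
        return softmax(linear(self.gen_w, theta))

    def encode(self, doc_words):
        """Encoder: document-word distribution -> document-topic distribution."""
        return softmax(linear(self.enc_w, doc_words))

    def discriminate(self, theta, doc_words):
        """Score a joint (topic, word) pair; training would push real and fake apart."""
        joint = theta + doc_words
        return sum(w * x for w, x in zip(self.disc_w, joint))


model = ToyBAT(vocab_size=6, num_topics=3)

# "Real" pair: a normalized word-count vector and its inferred topics.
counts = [2, 0, 1, 0, 3, 0]
real_doc = [c / sum(counts) for c in counts]
real_theta = model.encode(real_doc)

# "Fake" pair: topics sampled from the assumed prior and a generated document.
fake_theta = sample_dirichlet(0.5, 3, model.rng)
fake_doc = model.generate(fake_theta)

real_score = model.discriminate(real_theta, real_doc)
fake_score = model.discriminate(fake_theta, fake_doc)
```

In an adversarial setup of this kind, the discriminator would be trained to separate the real pair from the fake pair while the encoder and generator are trained to make them indistinguishable. The training loop is omitted here because the record gives no details about the objective.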