Survey of Automatic Labeling Methods for Topic Models

Bibliographic Details
Published in	Jisuanji kexue yu tansuo Vol. 17; no. 12; pp. 2861 - 2879
Main Authors	HE Dongbin, TAO Sha, ZHU Yanhong, REN Yanzhao, CHU Yunxia
Format	Journal Article
Language	Chinese
Published	Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 01.12.2023
Subjects	topic model; latent Dirichlet allocation (LDA); topic labeling; topic label
Summary:	Topic models are widely used to extract latent topics from unstructured corpora and discrete data. Because topics are generally expressed as word lists, users often find them difficult to interpret, especially when they lack knowledge of the subject area. Manually labeling topics yields more explanatory and easily understood labels, but the cost is too high for the approach to be feasible. Research on automatically labeling discovered topics addresses this problem. First, the currently most popular technique, latent Dirichlet allocation (LDA), is elaborated and analyzed. Topic labeling methods are then classified into three types according to the three representations of topic labels: phrases, abstracts, and pictures. Then, with a focus on improving topic interpretability, recent research using the different types of generated topic labels is sorted out, analyzed, and summarized
ISSN:	1673-9418
DOI:	10.3778/j.issn.1673-9418.2303083