Survey of Automatic Labeling Methods for Topic Models
Published in | Jisuanji kexue yu tansuo Vol. 17; no. 12; pp. 2861 - 2879 |
---|---|
Main Author | |
Format | Journal Article |
Language | Chinese |
Published | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 01.12.2023 |
Subjects | |
Summary: | Topic models are often used to extract latent topics from unstructured corpora and discrete data. Because topics are generally expressed as word lists, users often find it difficult to understand what a topic means, especially when they lack knowledge of the subject area. Manually labeling topics can produce more explanatory and easily understandable labels, but the cost is too high for the approach to be feasible. Research on automatically labeling discovered topics therefore offers a solution to this problem. First, the currently most popular technique, latent Dirichlet allocation (LDA), is elaborated and analyzed. Topic labeling methods are then classified into three types according to the three representations of topic labels: phrases, abstracts, and pictures. Next, with a focus on improving the interpretability of topics, recent research that uses these different types of generated topic labels is sorted out, analyzed, and summarized... |
---|---|
ISSN: | 1673-9418 |
DOI: | 10.3778/j.issn.1673-9418.2303083 |