Tipster: A Topic-Guided Language Model for Topic-Aware Text Segmentation
Published in | Database Systems for Advanced Applications Vol. 13247; pp. 213 - 221 |
---|---|
Main Authors | , , , , , , |
Format | Book Chapter |
Language | English |
Published | Switzerland: Springer International Publishing AG, 2022 |
Series | Lecture Notes in Computer Science |
Subjects | |
ISBN | 3031001281 9783031001284 |
ISSN | 0302-9743 1611-3349 |
DOI | 10.1007/978-3-031-00129-1_14 |
Summary: | Accurately segmenting plain documents and labeling their topical structure not only matches people’s reading habits but also facilitates various downstream tasks. Recent work has consistently suggested that text segmentation and segment topic labeling can be treated as mutually reinforcing tasks, and that incorporating word distributions has the potential to better model the latent topics of a document. To this end, we present a novel model, Tipster, that solves text segmentation and segment topic labeling collaboratively. We first use a neural topic model to infer the latent topic distributions of sentences from word distributions. Then, our model divides the document into topically coherent segments based on topic-guided contextual sentence representations from a pre-trained language model and assigns relevant topic labels to each segment. Finally, extensive experiments demonstrate that Tipster achieves state-of-the-art performance on both text segmentation and segment topic labeling. |
---|---|
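The summary describes a pipeline in which per-sentence topic distributions guide both segmentation and segment labeling. The following is a minimal sketch of that general idea only, not the Tipster model: it assumes the topic distributions have already been inferred by some topic model, and it substitutes a simple cosine-similarity threshold for the learned, language-model-based boundary prediction the chapter describes. The names `cosine`, `segment_and_label`, and `threshold` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def segment_and_label(
    topic_dists: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int, int]]:
    """Split sentences into segments and label each one with a topic index.

    topic_dists[i] is the (assumed, precomputed) topic distribution of sentence i.
    A boundary is placed wherever adjacent sentences fall below the similarity
    threshold; this heuristic is an illustrative stand-in, not the paper's method.
    Returns (start, end, topic) triples with `end` exclusive.
    """
    if not topic_dists:
        return []
    segments = []
    start = 0
    for i in range(1, len(topic_dists) + 1):
        at_end = i == len(topic_dists)
        if at_end or cosine(topic_dists[i - 1], topic_dists[i]) < threshold:
            block = topic_dists[start:i]
            num_topics = len(block[0])
            # Average the topic distributions in the segment, then take the top topic.
            mean = [sum(d[k] for d in block) / len(block) for k in range(num_topics)]
            label = max(range(num_topics), key=mean.__getitem__)
            segments.append((start, i, label))
            start = i
    return segments


# Four sentences over two topics: the first two lean to topic 0, the last two to topic 1.
example = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
print(segment_and_label(example))  # [(0, 2, 0), (2, 4, 1)]
```

In this toy setting segmentation and labeling share one signal, the sentence topic distributions, which loosely mirrors the collaborative framing in the summary; a real system would learn both the boundaries and the labels from data.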