Tipster: A Topic-Guided Language Model for Topic-Aware Text Segmentation

The accurate segmentation and structural topics of plain documents not only meet people’s reading habit, but also facilitate various downstream tasks. Recently, some works have consistently given positive hints that text segmentation and segment topic labeling could be regarded as a mutual task, and...

Bibliographic Details
Published in Database Systems for Advanced Applications, Vol. 13247, pp. 213-221
Main Authors Gong, Zheng; Tong, Shiwei; Wu, Han; Liu, Qi; Tao, Hanqing; Huang, Wei; Yu, Runlong
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2022
Springer International Publishing
Series Lecture Notes in Computer Science
ISBN 3031001281
9783031001284
ISSN 0302-9743
1611-3349
DOI 10.1007/978-3-031-00129-1_14

Abstract Accurately segmenting plain documents and identifying their topic structure not only matches people’s reading habits but also facilitates various downstream tasks. Recent work has consistently suggested that text segmentation and segment topic labeling can be regarded as a joint task, and that incorporating word distributions has the potential to better model the latent topics in a document. To this end, we present a novel model, Tipster, that solves text segmentation and segment topic labeling collaboratively. We first use a neural topic model to infer the latent topic distributions of sentences from word distributions. Then, our model divides the document into topically coherent segments based on the topic-guided contextual sentence representations of a pre-trained language model and assigns relevant topic labels to each segment. Finally, we conduct extensive experiments, which demonstrate that Tipster achieves state-of-the-art performance in both text segmentation and segment topic labeling.
Author Gong, Zheng
Wu, Han
Yu, Runlong
Huang, Wei
Tong, Shiwei
Tao, Hanqing
Liu, Qi
Author_xml – sequence: 1
  givenname: Zheng
  surname: Gong
  fullname: Gong, Zheng
– sequence: 2
  givenname: Shiwei
  surname: Tong
  fullname: Tong, Shiwei
– sequence: 3
  givenname: Han
  surname: Wu
  fullname: Wu, Han
– sequence: 4
  givenname: Qi
  surname: Liu
  fullname: Liu, Qi
  email: qiliuql@ustc.edu.cn
– sequence: 5
  givenname: Hanqing
  surname: Tao
  fullname: Tao, Hanqing
– sequence: 6
  givenname: Wei
  surname: Huang
  fullname: Huang, Wei
– sequence: 7
  givenname: Runlong
  surname: Yu
  fullname: Yu, Runlong
ContentType Book Chapter
Copyright The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
Copyright_xml – notice: The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
DBID FFUUA
DEWEY 005.7565
DatabaseName ProQuest Ebook Central - Book Chapters - Demo use only
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 303100129X
9783031001291
EISSN 1611-3349
Editor Lee Mong Li, Janice
Bhattacharya, Arnab
Agrawal, Divyakant
Goyal, Vikram
Reddy, P. Krishna
Uday Kiran, Rage
Mohania, Mukesh
Mondal, Anirban
Editor_xml – sequence: 1
  fullname: Lee Mong Li, Janice
– sequence: 2
  fullname: Bhattacharya, Arnab
– sequence: 3
  fullname: Agrawal, Divyakant
– sequence: 4
  fullname: Goyal, Vikram
– sequence: 5
  fullname: Reddy, P. Krishna
– sequence: 6
  fullname: Uday Kiran, Rage
– sequence: 7
  fullname: Mohania, Mukesh
– sequence: 8
  fullname: Mondal, Anirban
EndPage 221
ExternalDocumentID EBC6961416_234_237
IngestDate Tue Jul 29 20:22:16 EDT 2025
Thu May 29 16:38:08 EDT 2025
IsPeerReviewed true
IsScholarly true
LCCallNum QA76.9.D343
Language English
LinkModel OpenURL
OCLC 1312604634
PQID EBC6961416_234_237
PageCount 9
ParticipantIDs springer_books_10_1007_978_3_031_00129_1_14
proquest_ebookcentralchapters_6961416_234_237
PublicationCentury 2000
PublicationDate 2022
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – year: 2022
  text: 2022
PublicationDecade 2020
PublicationPlace Switzerland
PublicationPlace_xml – name: Switzerland
– name: Cham
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSeriesTitleAlternate Lect.Notes Computer
PublicationSubtitle 27th International Conference, DASFAA 2022, Virtual Event, April 11-14, 2022, Proceedings, Part III
PublicationTitle Database Systems for Advanced Applications
PublicationYear 2022
Publisher Springer International Publishing AG
Springer International Publishing
Publisher_xml – name: Springer International Publishing AG
– name: Springer International Publishing
RelatedPersons Hartmanis, Juris
Gao, Wen
Bertino, Elisa
Woeginger, Gerhard
Goos, Gerhard
Steffen, Bernhard
Yung, Moti
RelatedPersons_xml – sequence: 1
  givenname: Gerhard
  surname: Goos
  fullname: Goos, Gerhard
– sequence: 2
  givenname: Juris
  surname: Hartmanis
  fullname: Hartmanis, Juris
– sequence: 3
  givenname: Elisa
  surname: Bertino
  fullname: Bertino, Elisa
– sequence: 4
  givenname: Wen
  surname: Gao
  fullname: Gao, Wen
– sequence: 5
  givenname: Bernhard
  orcidid: 0000-0001-9619-1558
  surname: Steffen
  fullname: Steffen, Bernhard
– sequence: 6
  givenname: Gerhard
  orcidid: 0000-0001-8816-2693
  surname: Woeginger
  fullname: Woeginger, Gerhard
– sequence: 7
  givenname: Moti
  orcidid: 0000-0003-0848-0873
  surname: Yung
  fullname: Yung, Moti
SSID ssj0002721161
ssj0002792
Score 2.115819
SourceID springer
proquest
SourceType Publisher
StartPage 213
SubjectTerms Language model
Neural topic model
Text segmentation
Title Tipster: A Topic-Guided Language Model for Topic-Aware Text Segmentation
URI http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=6961416&ppg=237
http://link.springer.com/10.1007/978-3-031-00129-1_14
Volume 13247