Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 27.09.2022 |
Subjects | |
Summary: | Text segmentation aims to divide text into contiguous, semantically coherent
segments, while segment labeling deals with producing labels for each segment.
Past work has shown success in tackling segmentation and labeling for documents
and conversations. This has been possible with a combination of task-specific
pipelines, supervised and unsupervised learning objectives. In this work, we
propose a single encoder-decoder neural network that can handle long documents
and conversations, trained simultaneously for both segmentation and segment
labeling using only standard supervision. We successfully show a way to solve
the combined task as a pure generation task, which we refer to as structured
summarization. We apply the same technique to both document and conversational
data, and we show state-of-the-art performance across datasets for both
segmentation and labeling, under both high- and low-resource settings. Our
results establish a strong case for considering text segmentation and segment
labeling as a whole, and moving towards general-purpose techniques that don't
depend on domain expertise or task-specific components. |
---|---|
DOI: | 10.48550/arxiv.2209.13759 |
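
The summary describes casting segmentation and segment labeling as a single generation task. The sketch below illustrates one way such a target could be linearized into text and parsed back. It is not the paper's format: the `end : label | end : label` encoding and the helper names `linearize`, `parse`, and `boundaries` are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (index of the segment's last sentence, label).
Segment = Tuple[int, str]


def linearize(segments: List[Segment]) -> str:
    """Turn segments into one target string a decoder could be trained to emit."""
    return " | ".join(f"{end} : {label}" for end, label in segments)


def parse(target: str) -> List[Segment]:
    """Recover segments from generated text, skipping malformed chunks."""
    segments: List[Segment] = []
    for chunk in target.split(" | "):
        end, sep, label = chunk.partition(" : ")
        if not sep or not end.strip().isdigit():
            continue  # a generator may emit chunks that do not fit the format
        segments.append((int(end), label.strip()))
    return segments


def boundaries(segments: List[Segment], num_sentences: int) -> List[int]:
    """Per-sentence 0/1 vector, 1 where a sentence closes a segment."""
    ends = {end for end, _ in segments}
    return [1 if i in ends else 0 for i in range(num_sentences)]


if __name__ == "__main__":
    segs = [(2, "History"), (5, "Geography")]
    text = linearize(segs)
    print(text)
    print(parse(text))
    print(boundaries(segs, 6))
```

Under this assumed encoding, a single output string carries both the segment boundaries and the labels, which is the sense in which one generation model can cover both tasks. Labels that themselves contain ` | ` or ` : ` would need escaping in a real system.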