Structured abstract summarization of scientific articles: Summarization using full‐text section information
Published in | Journal of the Association for Information Science and Technology Vol. 74; no. 2; pp. 234-248 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Hoboken: Wiley Periodicals Inc, 01.02.2023 |
Subjects | |
Summary: | The automatic summarization of scientific articles differs from that of other text genres because of their structured format and greater length. Previous approaches have focused on the length of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and the characteristics of each section have not been fully explored, despite their importance. The lack of sufficient investigation and discussion of the characteristics of each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract that proportionally emphasizes each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this section information, in this study we examine tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large‐scale dataset and propose a summarization method that applies the introduction, methods, results, and discussion (IMRaD) format and reflects the characteristics of each section. We also discuss objective benchmarks and perspectives on state‐of‐the‐art algorithms and present the challenges and research directions in this area. |
---|---|
ISSN: | 2330-1635 2330-1643 |
DOI: | 10.1002/asi.24727 |