Automatic summarization of scientific publications using a feature selection approach

Bibliographic Details
Published in International journal on digital libraries Vol. 19; no. 2-3; pp. 203-215
Main Authors Al Saied, Hazem; Dugué, Nicolas; Lamirel, Jean-Charles
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.09.2018
Springer Verlag
Summary: Feature Maximization is a feature selection method that deals efficiently with textual data: it allows the design of systems that are language-agnostic, parameter-free, and require no additional corpora to function. We propose to evaluate its use in text summarization, in particular in cases where documents are structured. We first experiment with this approach in a single-document summarization context. We evaluate it on the DUC AQUAINT corpus and show that, despite the unstructured nature of the corpus, our system performs above the baseline and produces encouraging results. We also observe that the produced summaries seem robust to redundancy. Next, we evaluate our method in the more appropriate context of the SciSumm challenge, which is dedicated to the summarization of research publications. These publications are structured in sections, and our class-based approach is thus relevant. We focus more specifically on the task that aims to summarize papers using those that refer to them. We consider and evaluate several systems using our approach, dealing with specific bags of words. Furthermore, in these systems, we also evaluate cosine and graph-based distances for sentence weighting and comparison. We show that our Feature Maximization-based approach performs very well in the SciSumm 2016 context for the considered task, providing better results than those known so far and obtaining high recall. We thus demonstrate the flexibility and the relevance of Feature Maximization in this context.
ISSN: 1432-5012
1432-1300
DOI: 10.1007/s00799-017-0214-x