A Four Dimension Graph Model for Automatic Text Summarization

Bibliographic Details
Published in	2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Vol. 1, pp. 389-396
Main Authors	Ferreira, Rafael; Freitas, Frederico; de Souza Cabral, Luciano; Dueire Lins, Rafael; Lima, Rinaldo; Franca, Gabriel; Simske, Steven J.; Favaro, Luciano
Format	Conference Proceeding
Language	English
Published	IEEE 01.11.2013
Subjects	Graph-Model; Measurement uncertainty; Proposals; Semantics; Silicon; Summarization; Text processing; TextRank; Vectors
Summary:	Text summarization is the process of automatically creating a shorter version of one or more text documents. In this context, word-based, sentence-based and graph-based approaches are widely used. Among these, graph-based methods for automatic text summarization produce summaries based on the relationships between sentences. These relationships may also support other text processing applications, such as extractive and abstractive summarization, question answering and information retrieval systems. This paper proposes a new graph model for text processing applications. It relies on four dimensions (similarity, semantic similarity, coreference, and discourse information) to create the graph. The rationale behind the proposal is to use more dimensions than previous works and to take coreference resolution into account, which captures the role of pronouns in connecting sentences. Coreference was not used in any previous graph-based summarization technique. An experiment was performed using the TextRank algorithm with the presented approach on the CNN corpus. The results show that the proposed model outperforms current approaches both quantitatively and qualitatively.
DOI:	10.1109/WI-IAT.2013.55
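
Illustration:	The summary describes a sentence graph whose edges combine several dimensions and which is ranked with TextRank. The sketch below is not the authors' implementation; it is a minimal example of the general idea under stated assumptions. Only a simple word-overlap similarity is computed. The other dimensions (semantic similarity, coreference, discourse information) are represented by a hypothetical `extra_edges` dictionary supplied by hand, and all function names, the damping factor and the iteration count are assumptions chosen for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase word tokens; a simple stand-in for real preprocessing."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def lexical_similarity(a, b):
    """Word overlap normalized by log vocabulary sizes, in the style of TextRank."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    if len(ta) < 2 or len(tb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(ta & tb) / (math.log(len(ta)) + math.log(len(tb)))


def build_graph(sentences, extra_edges=None):
    """Return a symmetric weight matrix over sentences.

    extra_edges is a hypothetical {(i, j): weight} mapping (i != j) standing in
    for the semantic, coreference and discourse dimensions, which this sketch
    does not compute.
    """
    n = len(sentences)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = lexical_similarity(sentences[i], sentences[j])
    for (i, j), weight in (extra_edges or {}).items():
        w[i][j] += weight
        w[j][i] += weight
    return w


def textrank(w, damping=0.85, iterations=50):
    """Weighted PageRank by fixed-count power iteration."""
    n = len(w)
    scores = [1.0 / n] * n
    out = [sum(row) for row in w]  # total edge weight leaving each node
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


sentences = [
    "The storm hit the coast on Monday.",
    "It caused heavy flooding along the coast.",
    "Officials opened shelters for residents.",
]
# Assumed coreference link: "It" in sentence 1 refers to "The storm" in sentence 0.
extra = {(0, 1): 1.0}
scores = textrank(build_graph(sentences, extra))
ranking = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy input the third sentence shares no words with the others and has no extra edge, so it receives only the baseline score, while the two linked sentences reinforce each other. An extractive summary would keep the top-ranked sentences.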