Summarisation of the logical structure of XML documents


Bibliographic Details
Published in Information processing & management Vol. 48; no. 5; pp. 956-968
Main Authors Szlávik, Zoltán, Tombros, Anastasios, Lalmas, Mounia
Format Journal Article
Language English
Published Kidlington Elsevier Ltd 01.09.2012
Elsevier
Elsevier Science Ltd
Summary:
► We investigate the summarisation of the logical structure of XML documents.
► The structure is summarised using a method inspired by extractive text summarisation.
► We discuss and evaluate features used in structure summary generation.
► We present and compare ways to evaluate structure summaries.

Summarisation is traditionally used to produce summaries of the textual contents of documents. In this paper, it is argued that summarisation methods can also be applied to the logical structure of XML documents. Structure summarisation selects the most important elements of the logical structure and directs the user’s attention to the sections, subsections, etc. that are believed to be of particular interest. Structure summaries are shown to users as hierarchical tables of contents. This paper discusses methods for structure summarisation that use various features of XML elements to select the document portions on which a user’s attention should be focused. An evaluation methodology for structure summarisation is also introduced, and results from various summariser versions are presented and compared with one another. We show that data sets used in information retrieval evaluation can be used effectively to produce high-quality (query-independent) structure summaries. We also discuss the choice and effectiveness of particular summariser features with respect to several evaluation measures.
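As a rough illustration of the idea in the summary (score the elements of the logical structure, keep the highest-scoring ones, and show them as a hierarchical table of contents), a minimal sketch follows. It is not the authors' method: the features (text-length share, depth, tag weight), the weights, the tag names `sec`, `ss1` and `p`, and the `title` attribute are all assumptions made for this example, since the record does not list the features used in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag weights; the paper's actual features are not given here.
TAG_WEIGHT = {"article": 1.0, "sec": 1.0, "ss1": 0.7, "p": 0.3}


def text_length(elem):
    """Characters of text contained in elem and all of its descendants."""
    return len("".join(elem.itertext()).strip())


def score_elements(root):
    """Map each element to a toy score built from three assumed features."""
    total = max(text_length(root), 1)
    scores = {}

    def visit(elem, depth):
        length_share = text_length(elem) / total   # share of the document text
        depth_penalty = 1.0 / (1 + depth)          # shallower elements score higher
        tag_weight = TAG_WEIGHT.get(elem.tag, 0.1)
        scores[elem] = tag_weight * (0.5 * length_share + 0.5 * depth_penalty)
        for child in elem:
            visit(child, depth + 1)

    visit(root, 0)
    return scores


def summarise_structure(root, k):
    """Return indented table-of-contents lines for the k best elements.

    Ancestors of every selected element are kept as well, so the
    result is always a connected hierarchy rooted at the document.
    """
    scores = score_elements(root)
    parent = {child: p for p in root.iter() for child in p}
    top = sorted(scores, key=scores.get, reverse=True)[:k]

    keep = set()
    for elem in top:
        while elem is not None and elem not in keep:
            keep.add(elem)
            elem = parent.get(elem)

    lines = []

    def emit(elem, depth):
        if elem in keep:
            lines.append("  " * depth + elem.get("title", elem.tag))
            for child in elem:
                emit(child, depth + 1)

    emit(root, 0)
    return lines


if __name__ == "__main__":
    doc = ET.fromstring(
        '<article title="Doc">'
        '<sec title="Intro"><p>short</p></sec>'
        '<sec title="Methods"><ss1 title="Features">'
        "<p>a much longer paragraph of text here for testing</p>"
        "</ss1></sec></article>"
    )
    print("\n".join(summarise_structure(doc, k=3)))
```

Because the selection is extractive, the summary only ever contains elements that exist in the source document. A query-independent summariser of this kind could be tuned by changing the assumed weights and comparing the output against relevance judgements, which is the kind of comparison the summary describes.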
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2011.11.002