Subtopic-Focused Sentence Scoring in Multi-document Summarization

Bibliographic Details
Published in	Sixth International Conference on Advanced Language Processing and Web Information Technology (ALPIT 2007), pp. 98-104
Main Authors	Sujian Li, Weiguang Qu
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2007
Subjects	Bayesian methods; Computational linguistics; Data mining; Humans; Information technology; Linear discriminant analysis; Performance evaluation; Robustness; Unsupervised learning
Summary:	Previous work on multi-document summarization has seldom considered subtopics, typically focusing on a single topic when extracting a summary. In this paper, we propose a subtopic-focused model to score sentences in the extractive summarization task. Unlike supervised methods, it does not require costly manual work to build a training set. Multiple documents are represented as a mixture over subtopics, each denoted by a term distribution learned without supervision. Our method learns the subtopic distribution over sentences via a hierarchical Bayesian model, and sentences are then scored and extracted to form the summary. Experiments are performed on DUC 2006 data, and the ROUGE evaluation results show that the proposed method can reach state-of-the-art performance.
ISBN:	0769529305 9780769529301
DOI:	10.1109/ALPIT.2007.106
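
Illustrative note: the summary describes scoring sentences against subtopics that are represented as term distributions. The sketch below shows one simple way such a score could be computed once subtopic term distributions and subtopic weights are available. It is not the authors' hierarchical Bayesian model, and the record does not give their scoring formula; the function names, the mixture-of-mean-probabilities score, the `floor` value for unseen terms, and the toy data are all assumptions made for illustration.

```python
def score_sentence(tokens, subtopic_term_probs, subtopic_weights, floor=1e-6):
    """Score a sentence as sum_k weight_k * mean_t p(t | subtopic k).

    Assumed inputs (hypothetical, not from the paper):
      tokens              -- list of terms in the sentence
      subtopic_term_probs -- list of dicts, one per subtopic, term -> probability
      subtopic_weights    -- list of mixture weights, one per subtopic
      floor               -- probability assigned to terms a subtopic has not seen
    """
    if not tokens:
        return 0.0
    score = 0.0
    for probs, weight in zip(subtopic_term_probs, subtopic_weights):
        mean_prob = sum(probs.get(t, floor) for t in tokens) / len(tokens)
        score += weight * mean_prob
    return score


def rank_sentences(sentences, subtopic_term_probs, subtopic_weights):
    """Return (score, sentence) pairs sorted from highest to lowest score."""
    scored = [
        (score_sentence(s.lower().split(), subtopic_term_probs, subtopic_weights), s)
        for s in sentences
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy data: two hand-written subtopics standing in for learned distributions.
subtopics = [
    {"flood": 0.5, "rain": 0.3, "river": 0.2},
    {"aid": 0.6, "relief": 0.4},
]
weights = [0.7, 0.3]
sentences = ["flood river rain", "aid relief", "weather today"]

ranked = rank_sentences(sentences, subtopics, weights)
for score, sentence in ranked:
    print(f"{score:.4f}  {sentence}")
```

In an extractive setting, the top-ranked sentences would be taken as the summary until a length budget is reached. In the paper, the distributions are learned from the documents rather than supplied by hand.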