A semantic graph-based approach to biomedical summarisation

Bibliographic Details
Published in	Artificial Intelligence in Medicine Vol. 53; no. 1; pp. 1-14
Main Authors	Plaza, Laura; Díaz, Alberto; Gervás, Pablo
Format	Journal Article
Language	English
Published	Netherlands, Elsevier B.V., 01.09.2011
Subjects	Algorithms; Biomedical concept annotation; Biomedical text summarisation; Cluster Analysis; Concept clustering; Dealing; Heuristic; Information Storage and Retrieval - methods; Internal Medicine; Medical; Natural Language Processing; Other; Pattern Recognition, Automated; Periodicals as Topic; Semantic graphs; Semantics; Sentences; Subject Headings; Summaries; Texts; Unified Medical Language System
Summary:	Objective: Access to the vast body of research literature available in biomedicine and related fields may be improved by automatic summarisation. This paper presents a method for summarising biomedical scientific literature that takes into consideration the characteristics of the domain and the type of documents.
Methods: To address the problem of identifying salient sentences in biomedical texts, concepts and relations derived from the Unified Medical Language System (UMLS) are arranged to construct a semantic graph that represents the document. A degree-based clustering algorithm is then used to identify different themes or topics within the text. Different heuristics for sentence selection, intended to generate different types of summaries, are tested. A worked example on a real document illustrates how the method works.
Results: A large-scale evaluation is performed using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results are compared with those achieved by three well-known summarisers (two research prototypes and a commercial application) and two baselines. Our method significantly outperforms all summarisers and baselines. The best of our heuristics improves the ROUGE-1 score over the LexRank summariser by 0.056 (0.7862 versus 0.7302), a relative gain of almost 7.7%. A qualitative analysis of the summaries also shows that our method succeeds in identifying sentences that cover the main topic of the document and also considers other secondary or “satellite” information that might be relevant to the user.
Conclusion: The proposed method is shown to be an efficient approach to biomedical literature summarisation, which confirms that using concepts rather than terms can be very useful in automatic summarisation, especially in highly specialised domains.
ISSN:	0933-3657 1873-2860
DOI:	10.1016/j.artmed.2011.06.005
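
The summary reports ROUGE-1 scores without showing how such a score is computed. As a rough illustration only, the sketch below computes ROUGE-1 recall as clipped unigram overlap between a candidate summary and a single reference, and reproduces the arithmetic behind the two reported scores. The tokeniser, the function names (`unigrams`, `rouge1_recall`, `relative_gain`), and the toy sentences are assumptions made for this example. The evaluation in the paper may have used different preprocessing (for instance stemming or stop-word handling) and multiple references, so this is not a reconstruction of the authors' setup.

```python
from collections import Counter
import re


def unigrams(text: str) -> Counter:
    """Lowercase the text and count word tokens (a deliberately naive tokeniser)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def rouge1_recall(candidate: str, reference: str) -> float:
    """Share of reference unigrams that also appear in the candidate.

    Counts are clipped: a word repeated in the reference is only credited
    as many times as it occurs in the candidate.
    """
    cand, ref = unigrams(candidate), unigrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / total


def relative_gain(new: float, old: float) -> float:
    """Improvement of `new` over `old`, expressed as a fraction of `old`."""
    return (new - old) / old


# Hypothetical toy sentences, not data from the paper.
reference = "the graph represents the concepts of the document"
candidate = "a semantic graph represents the document concepts"
score = rouge1_recall(candidate, reference)

# Scores quoted in the summary: best heuristic versus LexRank.
absolute_gain = 0.7862 - 0.7302
gain = relative_gain(0.7862, 0.7302)

print(f"toy ROUGE-1 recall: {score:.3f}")
print(f"absolute gain: {absolute_gain:.4f}, relative gain: {gain:.2%}")
```

On the two reported scores, the difference is 0.056 in absolute terms and roughly 7.7% relative to the LexRank score, which is why it matters whether a gain is stated in points or as a percentage.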