A semantic graph-based approach to biomedical summarisation
Published in | Artificial Intelligence in Medicine, Vol. 53, no. 1, pp. 1-14 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Netherlands: Elsevier B.V., 01.09.2011 |
Summary: | Objective: Access to the vast body of research literature available in biomedicine and related fields may be improved by automatic summarisation. This paper presents a method for summarising biomedical scientific literature that takes into account the characteristics of the domain and the type of documents. Methods: To identify salient sentences in biomedical texts, concepts and relations derived from the Unified Medical Language System (UMLS) are arranged into a semantic graph that represents the document. A degree-based clustering algorithm is then used to identify the different themes or topics within the text. Several heuristics for sentence selection, each intended to generate a different type of summary, are tested. A real document is worked through as a case study to illustrate how the method operates. Results: A large-scale evaluation is performed using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results are compared with those achieved by three well-known summarisers (two research prototypes and a commercial application) and two baselines. The method significantly outperforms all summarisers and baselines. The best of the heuristics achieves a relative improvement of almost 7.7% in the ROUGE-1 score over the LexRank summariser (0.7862 versus 0.7302, an absolute gain of 5.6 percentage points). A qualitative analysis of the summaries also shows that the method succeeds in identifying sentences that cover the main topic of the document, while also including secondary or “satellite” information that might be relevant to the user. Conclusion: The proposed method is shown to be an efficient approach to biomedical literature summarisation, which confirms that using concepts rather than terms can be very useful in automatic summarisation, especially in highly specialised domains. |
---|---|
ISSN: | 0933-3657 1873-2860 |
DOI: | 10.1016/j.artmed.2011.06.005 |
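
The pipeline outlined in the summary (concept graph, degree-based clustering, sentence selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes each sentence has already been mapped to a set of concept names (the paper derives these from UMLS, which is not reproduced here), links concepts by simple co-occurrence rather than by UMLS relations, and uses a single overlap heuristic for sentence selection. The function names `build_concept_graph`, `cluster_by_degree` and `select_sentences` and the toy concepts are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(sentence_concepts):
    """Link every pair of concepts that co-occur in a sentence.

    Assumption: co-occurrence stands in for the UMLS-derived relations
    used in the paper.
    """
    graph = defaultdict(set)
    for concepts in sentence_concepts:
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def cluster_by_degree(graph, n_hubs):
    """Take the n_hubs highest-degree concepts as cluster centres and attach
    each remaining concept to the first (highest-ranked) hub it is linked to.

    Concepts with no direct link to any hub are left unassigned.
    """
    ranked = sorted(graph, key=lambda c: (-len(graph[c]), c))
    hubs = ranked[:n_hubs]
    clusters = {hub: {hub} for hub in hubs}
    for concept in ranked[n_hubs:]:
        linked = [hub for hub in hubs if hub in graph[concept]]
        if linked:
            clusters[linked[0]].add(concept)
    return clusters


def select_sentences(sentence_concepts, clusters, n_sentences):
    """Score each sentence by its overlap with the largest cluster and
    return the indices of the best n_sentences in document order."""
    main_theme = max(clusters.values(), key=len)
    scores = [len(set(c) & main_theme) for c in sentence_concepts]
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(top[:n_sentences])


# Toy document: one list of (hypothetical) concepts per sentence.
doc = [
    ["asthma", "inflammation", "airway"],
    ["asthma", "corticosteroid"],
    ["corticosteroid", "osteoporosis"],
    ["asthma", "airway", "allergen"],
]
graph = build_concept_graph(doc)
clusters = cluster_by_degree(graph, n_hubs=1)
print(select_sentences(doc, clusters, n_sentences=2))
```

Scoring against only the largest cluster favours sentences about the main topic; scoring against every cluster instead would be one way to bring in the secondary "satellite" information the summary mentions.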