Integrating Topic-Aware Heterogeneous Graph Neural Network with Transformer Model for Medical Scientific Document Abstractive Summarization


Bibliographic Details
Published in: IEEE Access, Vol. 12, p. 1
Main Authors: Khaliq, Ayesha; Khan, Atif; Awan, Salman Afsar; Jan, Salman; Umair, Muhammad; Zuhairi, Megat Farez Azril
Format: Journal Article
Language: English
Published: IEEE, 01.01.2024

Summary: The development of abstractive summarization methods is a crucial task in Natural Language Processing (NLP): it requires intelligent systems capable of extracting the main ideas from a text and generating a coherent summary. Many existing abstractive approaches either overlook the broader context or fail to capture global semantics when identifying salient content for the summary. Moreover, few studies have extensively evaluated abstractive summarization models in specific domains such as medical scientific document summarization. Motivated by this, the paper develops an integrated framework for abstractive summarization of medical scientific documents that combines a topic-aware Heterogeneous Graph Neural Network with a Transformer model. The framework uses Latent Dirichlet Allocation (LDA) for topic modeling to uncover latent topics and global information, preserving document-level attributes important for producing effective summaries. In addition to topic modeling, the framework employs a Heterogeneous Graph Neural Network (HGNN) that captures relationships between sentences through a graph-based document representation and allows local and global information to be updated concurrently. Finally, the framework is integrated with a Transformer decoder, which greatly enhances the model's ability to produce accurate and informative abstractive summaries. The proposed framework is evaluated on the publicly available PubMed dataset of medical scientific papers. Experimental results show that it outperforms state-of-the-art models, achieving F1-scores of 46.03 for Rouge-1, 21.42 for Rouge-2, and 39.71 for Rouge-L.
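The reported numbers are ROUGE F1-scores, which measure n-gram overlap between a generated summary and a human-written reference. As a minimal stdlib sketch (the function name and example texts are illustrative, not from the paper), Rouge-1 F1 reduces to clipped unigram precision and recall:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Illustrative pair: 5 of 6 unigrams match in each direction.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

Published evaluations typically rely on the official ROUGE toolkit (or the `rouge-score` package), which adds stemming and the Rouge-2 and Rouge-L variants; the sketch only shows the F1 formulation behind the reported scores.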
ISSN: 2169-3536
DOI:10.1109/ACCESS.2024.3443730