Interactive Self-Attentive Siamese Network for Biomedical Sentence Similarity

Bibliographic Details
Published in IEEE Access Vol. 8; pp. 84093-84104
Main Authors Li, Zhengguang, Lin, Hongfei, Zheng, Wei, Tadesse, Michael M., Yang, Zhihao, Wang, Jian
Format Journal Article
Language English
Published Piscataway IEEE 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Determining the semantic similarity between sentences is an important component of natural language processing (NLP) tasks such as text retrieval and text summarization. Many approaches have been proposed for estimating sentence similarity, and Siamese neural networks (SNN) are an effective choice. However, a sentence representation generated by an SNN with shared weights and no attention mechanism ignores the different contributions that individual words make to the overall sentence semantics. Furthermore, applying attention within only a single sentence neglects the interactive semantic influence between the two sentences on similarity estimation. To address these issues, this paper proposes an interactive self-attention (ISA) mechanism and integrates it with an SNN; the resulting model, the interactive self-attentive Siamese neural network (ISA-SNN), is used to verify the effectiveness of ISA. The model obtains the weights of words within a single sentence by means of self-attention and extracts interactive semantic information between sentences via interactive attention to enhance the sentence semantic representation. Without feature engineering, it outperforms existing methods on three biomedical benchmark datasets, achieving Pearson correlation coefficients of 0.656 on DBMI and 0.713/0.658 on CDD-ful/-ref.
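The idea in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' ISA-SNN: the abstract does not give the layer definitions, so the scoring functions, the concatenation step, the cosine output, and all names here (`self_attention_weights`, `interactive_attention`, `encode`, `similarity`, `pearson`) are assumptions chosen for illustration. It shows per-word self-attention weights, cross-sentence (interactive) attention, one parameter vector shared by both branches as in a Siamese setup, and the Pearson correlation used as the evaluation metric.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_weights(H, w):
    # One weight per word. H: (n_words, d) word vectors, w: (d,) parameters.
    # The tanh scoring form is an assumption, not taken from the paper.
    return softmax(np.tanh(H) @ w)

def interactive_attention(Ha, Hb):
    # For each word in sentence A, a softmax-weighted summary of sentence B's
    # words, and vice versa. Dot-product scoring is an assumption.
    scores = Ha @ Hb.T                        # (n_a, n_b)
    a_ctx = softmax(scores, axis=1) @ Hb      # (n_a, d)
    b_ctx = softmax(scores.T, axis=1) @ Ha    # (n_b, d)
    return a_ctx, b_ctx

def encode(H, H_ctx, w):
    # Sentence vector: self-attention-weighted sum of each word vector
    # concatenated with its cross-sentence context.
    alpha = self_attention_weights(H, w)            # (n,)
    enriched = np.concatenate([H, H_ctx], axis=1)   # (n, 2d)
    return alpha @ enriched                         # (2d,)

def similarity(Ha, Hb, w):
    # Siamese setup: the same parameters w encode both sentences.
    a_ctx, b_ctx = interactive_attention(Ha, Hb)
    va, vb = encode(Ha, a_ctx, w), encode(Hb, b_ctx, w)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb) + 1e-12))

def pearson(predicted, gold):
    # Pearson correlation between predicted and gold similarity scores.
    return float(np.corrcoef(predicted, gold)[0, 1])

# Random stand-ins for word embeddings of two sentences (5 and 7 words).
rng = np.random.default_rng(0)
d = 8
w = rng.normal(size=d)
Ha = rng.normal(size=(5, d))
Hb = rng.normal(size=(7, d))
score = similarity(Ha, Hb, w)
```

In a trained model the word vectors would come from an embedding or recurrent layer and `w` would be learned. With the random inputs above, `score` only demonstrates the data flow, not a meaningful similarity.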
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2985685