Relation extraction for colorectal cancer via deep learning with entity-aware feature orthogonal decomposition

Bibliographic Details
Published in Expert systems with applications Vol. 258; p. 125188
Main Authors Luo, Zhihao; Feng, Jianjun; Cai, Nian; Wang, Xiaodan; Liao, Jiacheng; Li, Quanqing; Peng, Fuqiang; Chen, Chuanwen
Format Journal Article
Language English
Published Elsevier Ltd 15.12.2024
Summary: Relation extraction is important for structuring the text of colorectal cancer (CRC) pathological reports, which in turn supports doctors’ disease diagnoses. Although relation extraction methods have been studied extensively for various natural language processing applications, they do not transfer well to CRC pathological reports because these reports have some unique characteristics. To this end, this paper designs a deep learning framework for extracting entity relations in CRC pathological reports, based on an encoder–decoder architecture with entity-aware feature orthogonal decomposition. Specifically, to effectively extract semantic features of long and short entities, a two-stream encoder is designed based on an edge-aware convolutional neural network (CNN) and a dimension-aware dilated convolution residual network (DCRN). To alleviate the influence of the blending of subject–object features, entity-aware feature orthogonal decomposition is designed to decompose the extracted semantic features into three types: subject features, object features, and subject–object shared features. A stage-wise cross-entropy loss is proposed to train the network effectively. Comparison experiments indicate that the designed network performs well on CRC pathological texts, achieving a 92.3% F1 score, 93.1% precision, and 91.5% recall, and outperforming existing relation extraction models.
• Joint relation extraction for CRC pathological reports via deep learning.
• Edge-aware CNNs and dimension-aware DCRN for extracting entity features.
• Entity-aware feature orthogonal decomposition for decomposing entity features.
• A stage-wise cross-entropy loss is proposed to ensure effective network training.
• Performs well on real CRC pathological reports, with a 92.3% F1 score.
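The three reported metrics can be checked against one another, since the F1 score is conventionally the harmonic mean of precision and recall. The sketch below is illustrative only: it assumes the summary uses that standard definition, and the function and variable names are hypothetical rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the summary, in percent.
reported_precision = 93.1
reported_recall = 91.5

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 1))  # prints 92.3, matching the reported F1 score
```

Under that assumption, the reported precision and recall are consistent with the reported 92.3% F1 score.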
ISSN:0957-4174
DOI:10.1016/j.eswa.2024.125188