Relation extraction for colorectal cancer via deep learning with entity-aware feature orthogonal decomposition
Published in | Expert systems with applications Vol. 258; p. 125188 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 15.12.2024 |
Subjects | |
Summary: | Relation extraction is important for structuring the text of colorectal cancer (CRC) pathological reports, which in turn supports doctors’ diagnoses. Although relation extraction methods have been studied extensively for many natural language processing applications, they do not transfer well to CRC pathological reports because these reports have unique characteristics. To address this, the paper designs a deep learning framework for extracting entity relations in CRC pathological reports, based on an encoder–decoder architecture with entity-aware feature orthogonal decomposition. Specifically, to extract semantic features of both long and short entities effectively, a two-stream encoder is built from an edge-aware convolutional neural network and a dimension-aware dilated convolution residual network. To reduce the blending of subject and object features, entity-aware feature orthogonal decomposition splits the extracted semantic features into three types: subject features, object features, and subject–object shared features. A stage-wise cross-entropy loss is proposed to train the network. In comparison experiments, the network achieved an F1 score of 92.3%, a precision of 93.1%, and a recall of 91.5% on CRC pathological texts, outperforming existing relation extraction models.
• Joint relation extraction for CRC pathological reports via deep learning. • Edge-aware CNNs and a dimension-aware DCRN for extracting entity features. • Entity-aware feature orthogonal decomposition for decomposing entity features. • A stage-wise cross-entropy loss is proposed to support network training. • Performs well on real CRC pathological reports, with a 92.3% F1 score. |
---|---|
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2024.125188 |
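
The summary reports precision, recall, and F1 together. As a quick consistency check, the sketch below recomputes F1 as the harmonic mean of the reported precision and recall. The function name `f1_score` is an illustrative helper, not something taken from the paper, and the check assumes the reported F1 is the standard harmonic mean of those two figures.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the summary, in percent.
reported_precision = 93.1
reported_recall = 91.5

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 1))  # rounds to the 92.3 reported in the summary
```

Under that assumption, the three reported figures agree with each other to one decimal place.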