A large-scale dataset for korean document-level relation extraction from encyclopedia texts
Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the stu...
Saved in:
Published in | Applied intelligence (Dordrecht, Netherlands) Vol. 54; no. 17-18; pp. 8681 - 8701 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published |
New York
Springer US
01.09.2024
Springer Nature B.V |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the studies, we present TREK (
T
oward Document-Level
R
elation
E
xtraction in
K
orean) dataset constructed from Korean encyclopedia documents written by the domain experts. We provide detailed statistical analyses for our large-scale dataset and human evaluation results suggest the assured quality of TREK . Also, we introduce the document-level RE model that considers the named entity-type while considering the Korean language’s properties. In the experiments, we demonstrate that our proposed model outperforms the baselines and conduct qualitative analysis. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-024-05605-9 |