Open Relation Extraction for Chinese Noun Phrases

Relation Extraction (RE) aims at harvesting relational facts from texts. The majority of existing research targets knowledge acquisition from sentences, where subject-verb-object structures are usually treated as signals of the existence of relations. In contrast, relational facts expressed within...

Bibliographic Details
Published in IEEE Transactions on Knowledge and Data Engineering, Vol. 33, no. 6, pp. 2693-2708
Main Authors Wang, Chengyu; He, Xiaofeng; Zhou, Aoying
Format Journal Article
Language English
Published New York IEEE 01.06.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1041-4347
EISSN 1558-2191
DOI 10.1109/TKDE.2019.2953839

Abstract Relation Extraction (RE) aims at harvesting relational facts from texts. The majority of existing research targets knowledge acquisition from sentences, where subject-verb-object structures are usually treated as signals of the existence of relations. In contrast, relational facts expressed within noun phrases are highly implicit. Previous work mostly relies on human-compiled assertions and textual patterns in English to address noun phrase-based RE. For Chinese, the corresponding task is non-trivial because Chinese is a highly analytic language with flexible expressions. Additionally, noun phrases tend to be grammatically incomplete, and clear mentions of predicates are often missing. In this article, we present an unsupervised Noun Phrase-based Open RE system for the Chinese language (NPORE), which employs a three-layer data-driven architecture. The system contains three components: the Modifier-sensitive Phrase Segmenter, the Candidate Relation Generator and the Missing Relation Predicate Detector. It integrates a graph clique mining algorithm to chunk Chinese noun phrases, taking into account how relations are expressed. We further propose a probabilistic method with knowledge priors and a hypergraph-based random walk process to detect missing relation predicates. Experiments over Chinese Wikipedia show that NPORE outperforms state-of-the-art methods, extracting 55.2 percent more relations than the most competitive baseline at a comparable precision of 95.4 percent.
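The abstract names a hypergraph-based random walk but does not give its formulation. As a rough illustration only, the sketch below shows one generic random walk with restart on a hypergraph: from a node, the walker picks an incident hyperedge uniformly, then a node inside that hyperedge uniformly. The function name `hypergraph_random_walk`, the `restart` and `iterations` parameters, and the node labels are all assumptions made for this example and are not taken from the NPORE article.

```python
from collections import defaultdict


def hypergraph_random_walk(hyperedges, seed, restart=0.15, iterations=50):
    """Score nodes with a random walk with restart on a hypergraph.

    hyperedges: list of sets of node labels (hypothetical input format).
    seed: node the walker jumps back to with probability `restart`.
    Returns a dict mapping each node to its visiting probability.
    """
    # Index every node by the hyperedges that contain it.
    incident = defaultdict(list)
    for edge in hyperedges:
        for node in edge:
            incident[node].append(edge)
    if seed not in incident:
        raise ValueError("seed must belong to at least one hyperedge")

    # Start with all probability mass on the seed node.
    score = {node: 0.0 for node in incident}
    score[seed] = 1.0

    for _ in range(iterations):
        nxt = {node: 0.0 for node in incident}
        for node, mass in score.items():
            edges = incident[node]
            for edge in edges:
                # Uniform choice of hyperedge, then uniform choice of node in it.
                share = (1.0 - restart) * mass / (len(edges) * len(edge))
                for target in edge:
                    nxt[target] += share
        # Restart mass always returns to the seed.
        nxt[seed] += restart
        score = nxt
    return score


# Hypothetical toy hypergraph: labels are placeholders, not data from the article.
toy_edges = [{"head_noun", "modifier_a", "modifier_b"}, {"head_noun", "predicate_x"}]
toy_scores = hypergraph_random_walk(toy_edges, seed="head_noun")
```

In a setting like the one the abstract describes, nodes could stand for phrase segments and candidate predicates, with higher-scoring predicate nodes treated as more plausible; that mapping is an assumption of this sketch, not a description of the published system.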
Author Zhou, Aoying
He, Xiaofeng
Wang, Chengyu
Author_xml – sequence: 1
  givenname: Chengyu
  orcidid: 0000-0003-1010-9678
  surname: Wang
  fullname: Wang, Chengyu
  email: chywang2013@gmail.com
  organization: School of Software Engineering, East China Normal University, Shanghai, China
– sequence: 2
  givenname: Xiaofeng
  orcidid: 0000-0002-6911-348X
  surname: He
  fullname: He, Xiaofeng
  email: hexf@cs.ecnu.edu.cn
  organization: School of Computer Science and Technology, East China Normal University, Shanghai, China
– sequence: 3
  givenname: Aoying
  surname: Zhou
  fullname: Zhou, Aoying
  email: ayzhou@dase.ecnu.edu.cn
  organization: School of Data Science and Engineering, East China Normal University, Shanghai, China
CODEN ITKEEH
CitedBy_id crossref_primary_10_1162_dint_a_00227
crossref_primary_10_3390_sym13091742
crossref_primary_10_1109_TKDE_2023_3240851
crossref_primary_10_1016_j_ins_2023_03_089
crossref_primary_10_1109_TKDE_2023_3317139
crossref_primary_10_1109_TKDE_2022_3171690
crossref_primary_10_1007_s40747_023_01075_7
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DOI 10.1109/TKDE.2019.2953839
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database
Discipline Engineering
Computer Science
EISSN 1558-2191
EndPage 2708
ExternalDocumentID 10_1109_TKDE_2019_2953839
8903488
Genre orig-research
GrantInformation_xml – fundername: National Key Research and Development Program of China
  grantid: 2016YFB1000904
  funderid: 10.13039/501100012166
ISSN 1041-4347
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-6911-348X
0000-0003-1010-9678
PageCount 16
PublicationCentury 2000
PublicationDate 2021-06-01
PublicationDateYYYYMMDD 2021-06-01
PublicationDate_xml – month: 06
  year: 2021
  text: 2021-06-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on knowledge and data engineering
PublicationTitleAbbrev TKDE
PublicationYear 2021
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 2693
SubjectTerms Algorithms
Electronic publishing
Encyclopedias
graph clique mining
hypergraph-based random walk
Internet
Knowledge acquisition
Magnetic heads
noun phrase segmentation
Open relation extraction
Probabilistic methods
Random walk
Semantics
Sentences
Task analysis
Title Open Relation Extraction for Chinese Noun Phrases
URI https://ieeexplore.ieee.org/document/8903488
https://www.proquest.com/docview/2525791460
Volume 33