Open Relation Extraction for Chinese Noun Phrases
Published in | IEEE Transactions on Knowledge and Data Engineering, Vol. 33, no. 6, pp. 2693-2708 |
---|---|
Main Authors | Wang, Chengyu; He, Xiaofeng; Zhou, Aoying |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2021 |
Subjects | |
ISSN | 1041-4347, 1558-2191 (EISSN) |
DOI | 10.1109/TKDE.2019.2953839 |
Abstract | Relation Extraction (RE) aims to harvest relational facts from text. Most existing research targets knowledge acquisition from sentences, where subject-verb-object structures are usually treated as signals that a relation exists. In contrast, relational facts expressed within noun phrases are highly implicit. Previous work mostly relies on human-compiled assertions and textual patterns in English to address noun phrase-based RE. For Chinese, the corresponding task is non-trivial because Chinese is a highly analytic language with flexible expressions. Additionally, noun phrases tend to be grammatically incomplete, and clear mentions of predicates are often missing. In this article, we present an unsupervised Noun Phrase-based Open RE system for the Chinese language (NPORE), which employs a three-layer data-driven architecture. The system contains three components: a Modifier-sensitive Phrase Segmenter, a Candidate Relation Generator, and a Missing Relation Predicate Detector. It integrates a graph clique mining algorithm to chunk Chinese noun phrases, taking into account how relations are expressed. We further propose a probabilistic method with knowledge priors and a hypergraph-based random walk process to detect missing relation predicates. Experiments over Chinese Wikipedia show that NPORE outperforms state-of-the-art methods, extracting 55.2 percent more relations than the most competitive baseline at a comparable precision of 95.4 percent. |
---|---|
Author | Zhou, Aoying; He, Xiaofeng; Wang, Chengyu |
Author_xml | 1. Wang, Chengyu (ORCID: 0000-0003-1010-9678; email: chywang2013@gmail.com), School of Software Engineering, East China Normal University, Shanghai, China; 2. He, Xiaofeng (ORCID: 0000-0002-6911-348X; email: hexf@cs.ecnu.edu.cn), School of Computer Science and Technology, East China Normal University, Shanghai, China; 3. Zhou, Aoying (email: ayzhou@dase.ecnu.edu.cn), School of Data Science and Engineering, East China Normal University, Shanghai, China |
CODEN | ITKEEH |
CitedBy_id | crossref_primary_10_1162_dint_a_00227 crossref_primary_10_3390_sym13091742 crossref_primary_10_1109_TKDE_2023_3240851 crossref_primary_10_1016_j_ins_2023_03_089 crossref_primary_10_1109_TKDE_2023_3317139 crossref_primary_10_1109_TKDE_2022_3171690 crossref_primary_10_1007_s40747_023_01075_7 |
Cites_doi | 10.1145/2556195.2556245 10.18653/v1/P18-2016 10.3115/v1/D14-1201 10.24963/ijcai.2018/610 10.18653/v1/D17-1123 10.1016/j.artint.2012.06.001 10.1145/219717.219745 10.18653/v1/P17-1110 10.18653/v1/D16-1236 10.18653/v1/W16-1307 10.1145/3130348.3130377 10.3115/v1/D14-1038 10.1016/j.ejor.2006.06.035 10.1145/3018661.3018662 10.18653/v1/P19-1279 10.3115/v1/E14-4003 10.18653/v1/N18-1003 10.18653/v1/P16-1187 10.1109/TKDE.2018.2866863 10.1145/1242572.1242667 10.1007/11861461_25 10.1145/3184558.3186927 10.1145/3162077 10.18653/v1/P17-1004 10.18653/v1/P18-2065 10.18653/v1/D17-1273 10.1145/3219819.3220115 10.3115/v1/P14-1113 10.18653/v1/N18-1075 10.1109/TKDE.2018.2865942 10.1162/tacl_a_00038 10.3115/v1/N15-1037 10.18653/v1/P17-1053 10.18653/v1/D17-1278 10.1145/3038912.3052708 10.1145/2213836.2213891 10.18653/v1/P17-1128 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
DOI | 10.1109/TKDE.2019.2953839 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Technology Research Database |
Discipline | Engineering Computer Science |
EISSN | 1558-2191 |
EndPage | 2708 |
ExternalDocumentID | 10_1109_TKDE_2019_2953839 8903488 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Key Research and Development Program of China grantid: 2016YFB1000904 funderid: 10.13039/501100012166 |
ISSN | 1041-4347 |
IngestDate | Mon Jun 30 05:51:56 EDT 2025 Thu Apr 24 23:03:23 EDT 2025 Tue Jul 01 01:19:36 EDT 2025 Wed Aug 27 02:29:31 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 6 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0002-6911-348X 0000-0003-1010-9678 |
PQID | 2525791460 |
PQPubID | 85438 |
PageCount | 16 |
ParticipantIDs | ieee_primary_8903488 crossref_citationtrail_10_1109_TKDE_2019_2953839 proquest_journals_2525791460 crossref_primary_10_1109_TKDE_2019_2953839 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2021-06-01 |
PublicationDateYYYYMMDD | 2021-06-01 |
PublicationDate_xml | – month: 06 year: 2021 text: 2021-06-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on knowledge and data engineering |
PublicationTitleAbbrev | TKDE |
PublicationYear | 2021 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database Enrichment Source Index Database Publisher |
StartPage | 2693 |
SubjectTerms | Algorithms; Electronic publishing; Encyclopedias; graph clique mining; hypergraph-based random walk; Internet; Knowledge acquisition; Magnetic heads; noun phrase segmentation; Open relation extraction; Probabilistic methods; Random walk; Semantics; Sentences; Task analysis |
Title | Open Relation Extraction for Chinese Noun Phrases |
URI | https://ieeexplore.ieee.org/document/8903488 https://www.proquest.com/docview/2525791460 |
Volume | 33 |