Information Extraction Method based on Dilated Convolution and Character-Enhanced Word Embedding

Bibliographic Details
Published in	2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), pp. 138-143
Main Authors	He, Zhaorong; Luo, Xiaonan; Zhong, Yanru; Jiang, Chaohao; Zhao, Leixian
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2020
Subjects	character-enhanced word embedding; Chinese natural language processing; Computational modeling; Convolution; Data mining; dilated convolution; Information extraction; Information retrieval; Natural language processing; self-attention; Stability analysis; Task analysis

Abstract	With the establishment and application of various kinds of knowledge graphs, information extraction has become one of the most important tasks in natural language processing. Because of the complexity of the Chinese language, traditional pipeline and joint extraction methods in most cases cannot resolve entity confusion when dealing with one-to-many entity relationships. To address this problem, this paper presents an approach based on seq2seq decoding that directly predicts two object entities of a triple from the subject entity, uses dilated convolution and character-enhanced word embedding, and adds self-attention to optimize the encoding of Chinese words. The approach not only simplifies the extraction process but also resolves the entity conflict that arises in one-to-many entity relationship extraction. Experiments show that the proposed method performs better on datasets than traditional relation extraction models and has better scalability.
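The abstract names dilated convolution as one building block of the encoder but gives no details in this record. As a rough illustration of the general technique only, the following is a minimal pure-Python sketch of a same-length 1-D dilated convolution and of how stacking layers widens the receptive field. The function names, the kernel size of 3, and the dilation schedule (1, 2, 4) are assumptions chosen for the example; they are not taken from the paper and do not represent the authors' architecture.

```python
def dilated_conv1d(seq, kernel, dilation):
    """Same-length 1-D dilated convolution with zero padding.

    Computed as a cross-correlation (no kernel flip), which is the usual
    convention in neural-network layers. Hypothetical helper for illustration.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(seq)):
        total = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` positions apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(seq):  # positions outside the sequence count as zero
                total += w * seq[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input positions one output position can see after stacking layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Push a single impulse through three stacked layers with assumed dilations 1, 2, 4.
signal = [0.0] * 15
signal[7] = 1.0
x = signal
for d in (1, 2, 4):
    x = dilated_conv1d(x, [1.0, 1.0, 1.0], d)

# Indices whose output was influenced by the single non-zero input.
touched = [i for i, v in enumerate(x) if v != 0.0]
print(receptive_field(3, [1, 2, 4]), len(touched))
```

With these assumed settings, three layers of width-3 kernels cover a span that would otherwise need a much wider kernel or more layers, which is the usual motivation for using dilation when encoding long character or word sequences.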
Author	Zhao, Leixian He, Zhaorong Jiang, Chaohao Luo, Xiaonan Zhong, Yanru
Author_xml	– sequence: 1 givenname: Zhaorong surname: He fullname: He, Zhaorong email: zhaorong_he@outlook.com organization: Guilin University of Electronic Technology,Guilin,China – sequence: 2 givenname: Xiaonan surname: Luo fullname: Luo, Xiaonan email: luoxn@guet.edu.cn organization: Guilin University of Electronic Technology,Guilin,China – sequence: 3 givenname: Yanru surname: Zhong fullname: Zhong, Yanru email: Rosezhong@guet.edu.cn organization: Guilin University of Electronic Technology,Guilin,China – sequence: 4 givenname: Chaohao surname: Jiang fullname: Jiang, Chaohao email: 2009853ZII30002@student.must.edu.mo organization: Macau University of Science and Technology,Faculty of Information Technology,Macau,China – sequence: 5 givenname: Leixian surname: Zhao fullname: Zhao, Leixian email: LeixianZhao@outlook.com organization: Guilin University of Electronic Technology,Guilin,China
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.1109/CyberC49757.2020.00031
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	1728184487 9781728184487
EndPage	143
ExternalDocumentID	9329414
Genre	orig-research
GrantInformation_xml	– fundername: National Natural Science Foundation of China grantid: 61562016 funderid: 10.13039/501100001809
GroupedDBID	6IE 6IL CBEJK RIE RIL
ID	FETCH-LOGICAL-i118t-8fde7b267b2d39fc4c9bcd4a295527677a5a93b2b3b6414091a963b84e10a01b3
IEDL.DBID	RIE
IngestDate	Thu Jun 29 18:38:59 EDT 2023
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i118t-8fde7b267b2d39fc4c9bcd4a295527677a5a93b2b3b6414091a963b84e10a01b3
PageCount	6
ParticipantIDs	ieee_primary_9329414
PublicationCentury	2000
PublicationDate	2020-Oct.
PublicationDateYYYYMMDD	2020-10-01
PublicationDate_xml	– month: 10 year: 2020 text: 2020-Oct.
PublicationDecade	2020
PublicationTitle	2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)
PublicationTitleAbbrev	CYBERC
PublicationYear	2020
Publisher	IEEE
Publisher_xml	– name: IEEE
Score	1.7382786
SourceID	ieee
SourceType	Publisher
StartPage	138
URI	https://ieeexplore.ieee.org/document/9329414
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
linkProvider	IEEE