Information Extraction Method based on Dilated Convolution and Character-Enhanced Word Embedding
Published in | 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) pp. 138 - 143 |
---|---|
Main Authors | He, Zhaorong; Luo, Xiaonan; Zhong, Yanru; Jiang, Chaohao; Zhao, Leixian |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.10.2020 |
Subjects | |
Abstract | With the establishment and application of various kinds of knowledge graphs, information extraction has become one of the most important tasks in natural language processing. Due to the complexity of the Chinese language, traditional pipeline and joint extraction methods cannot resolve entity confusion in most cases when dealing with one-to-many entity relationships. To solve this problem, this paper presents an approach based on seq2seq decoding that directly predicts two object entities of a triple from the subject entity, uses dilated convolution and character-enhanced word embedding, and adds self-attention to optimize the encoding of Chinese words. The approach not only simplifies the extraction process but also resolves the entity conflict that arises in one-to-many entity relationship extraction. Experiments show that the proposed method performs better on datasets than traditional relation extraction models and has better scalability. |
---|---|
Author | Zhao, Leixian; He, Zhaorong; Jiang, Chaohao; Luo, Xiaonan; Zhong, Yanru |
Author_xml | – sequence: 1 givenname: Zhaorong surname: He fullname: He, Zhaorong email: zhaorong_he@outlook.com organization: Guilin University of Electronic Technology,Guilin,China – sequence: 2 givenname: Xiaonan surname: Luo fullname: Luo, Xiaonan email: luoxn@guet.edu.cn organization: Guilin University of Electronic Technology,Guilin,China – sequence: 3 givenname: Yanru surname: Zhong fullname: Zhong, Yanru email: Rosezhong@guet.edu.cn organization: Guilin University of Electronic Technology,Guilin,China – sequence: 4 givenname: Chaohao surname: Jiang fullname: Jiang, Chaohao email: 2009853ZII30002@student.must.edu.mo organization: Macau University of Science and Technology,Faculty of Information Technology,Macau,China – sequence: 5 givenname: Leixian surname: Zhao fullname: Zhao, Leixian email: LeixianZhao@outlook.com organization: Guilin University of Electronic Technology,Guilin,China |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/CyberC49757.2020.00031 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 1728184487 9781728184487 |
EndPage | 143 |
ExternalDocumentID | 9329414 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Natural Science Foundation of China grantid: 61562016 funderid: 10.13039/501100001809 |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-i118t-8fde7b267b2d39fc4c9bcd4a295527677a5a93b2b3b6414091a963b84e10a01b3 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:38:59 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i118t-8fde7b267b2d39fc4c9bcd4a295527677a5a93b2b3b6414091a963b84e10a01b3 |
PageCount | 6 |
ParticipantIDs | ieee_primary_9329414 |
PublicationCentury | 2000 |
PublicationDate | 2020-Oct. |
PublicationDateYYYYMMDD | 2020-10-01 |
PublicationDate_xml | – month: 10 year: 2020 text: 2020-Oct. |
PublicationDecade | 2020 |
PublicationTitle | 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) |
PublicationTitleAbbrev | CYBERC |
PublicationYear | 2020 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
Score | 1.7382786 |
Snippet | With the establishment and application of various kinds of knowledge graphs, information extraction has become one of the most important tasks in natural... |
SourceID | ieee |
SourceType | Publisher |
StartPage | 138 |
SubjectTerms | character-enhanced word embedding; Chinese natural language processing; Computational modeling; Convolution; Data mining; dilated convolution; Information extraction; Information retrieval; Natural language processing; self-attention; Stability analysis; Task analysis |
Title | Information Extraction Method based on Dilated Convolution and Character-Enhanced Word Embedding |
URI | https://ieeexplore.ieee.org/document/9329414 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | IEEE |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2020+International+Conference+on+Cyber-Enabled+Distributed+Computing+and+Knowledge+Discovery+%28CyberC%29&rft.atitle=Information+Extraction+Method+based+on+Dilated+Convolution+and+Character-Enhanced+Word+Embedding&rft.au=He%2C+Zhaorong&rft.au=Luo%2C+Xiaonan&rft.au=Zhong%2C+Yanru&rft.au=Jiang%2C+Chaohao&rft.date=2020-10-01&rft.pub=IEEE&rft.spage=138&rft.epage=143&rft_id=info:doi/10.1109%2FCyberC49757.2020.00031&rft.externalDocID=9329414 |
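
Illustrative note: the abstract in this record names dilated convolution and character-enhanced word embedding as components of the encoder but gives no implementation details. The sketch below is not the authors' code. It shows, under stated assumptions, what the two terms commonly mean: a 1-D convolution whose kernel taps are spaced `dilation` positions apart, and a word vector combined with the mean of its character vectors. The function names, the zero padding, and the additive composition are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def dilated_conv1d(seq: Sequence[float],
                   kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """1-D dilated convolution (cross-correlation form) with zero padding.

    The output has the same length as the input. Kernel taps are spaced
    `dilation` positions apart and centred on the current position, so a
    larger dilation widens the receptive field without adding weights.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    half = len(kernel) // 2
    out: List[float] = []
    for i in range(len(seq)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * dilation  # input index read by tap k
            if 0 <= j < len(seq):          # out-of-range positions count as zero
                total += weight * seq[j]
        out.append(total)
    return out


def char_enhanced_embedding(word_vec: Vector,
                            char_vecs: Sequence[Vector]) -> Vector:
    """Word vector plus the mean of its character vectors.

    Assumed composition for illustration only; the paper may combine
    word and character information differently.
    """
    if not char_vecs:
        return list(word_vec)
    n = len(char_vecs)
    return [w + sum(c[d] for c in char_vecs) / n
            for d, w in enumerate(word_vec)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_conv1d(signal, [1.0, 1.0, 1.0], dilation=1))
    print(dilated_conv1d(signal, [1.0, 1.0, 1.0], dilation=2))
    print(char_enhanced_embedding([1.0, 2.0], [[1.0, 0.0], [3.0, 2.0]]))
```

With a three-tap kernel, dilation 1 reads positions i-1, i, i+1, while dilation 2 reads i-2, i, i+2. Stacking layers with growing dilation is the usual way such encoders cover long character spans with few layers.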