Information Extraction Method based on Dilated Convolution and Character-Enhanced Word Embedding

Bibliographic Details
Published in	2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), pp. 138-143
Main Authors	He, Zhaorong; Luo, Xiaonan; Zhong, Yanru; Jiang, Chaohao; Zhao, Leixian
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2020
Subjects	character-enhanced word embedding; Chinese natural language processing; Computational modeling; Convolution; Data mining; dilated convolution; Information extraction; Information retrieval; Natural language processing; self-attention; Stability analysis; Task analysis
Summary:	With the establishment and application of various kinds of knowledge graphs, information extraction has become one of the most important tasks in natural language processing. Due to the complexity of the Chinese language, traditional pipeline and joint extraction methods in most cases cannot resolve entity confusion when dealing with one-to-many entity relationships. To address this problem, this paper presents an approach based on seq2seq decoding that directly predicts the two object entities of a triple from the subject entity, uses dilated convolution and character-enhanced word embedding, and adds self-attention to optimize the encoding of Chinese words. The approach not only simplifies the extraction process but also resolves the entity conflict that arises in one-to-many entity relation extraction. Experiments show that the proposed method performs better than traditional relation extraction models on the datasets tested and has better scalability.
DOI:	10.1109/CyberC49757.2020.00031