Learning Morpheme Representation for Mongolian Named Entity Recognition
| Published in | Neural Processing Letters, Vol. 50, No. 3, pp. 2647–2664 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US, 01.12.2019 (Springer Nature B.V.) |
| Summary | Traditional approaches to Mongolian named entity recognition rely heavily on feature engineering. Worse, the complex morphological structure of Mongolian words aggravates data sparsity. To reduce both feature engineering and data sparsity in Mongolian named entity recognition, we propose a framework of recurrent neural networks with morpheme representations, and we study this framework in depth through several model variants. More specifically, the morpheme representation exploits characteristics of the classical Mongolian script and can be learned from an unlabeled corpus. The model is further augmented with different character representations and auxiliary language-model losses that extract contextual knowledge from scratch. By decoding jointly with a Conditional Random Field (CRF) layer, the model learns the dependencies between labels. Experimental results show that feeding morpheme representations into the neural network outperforms word representations; the additional character representation and the morpheme language-model loss also improve performance. |
|---|---|
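The joint CRF decoding mentioned in the summary is typically realized with the Viterbi algorithm: given per-token emission scores from the recurrent network and a learned label-transition matrix, it finds the globally best label sequence rather than picking each label independently. The sketch below is a minimal, self-contained illustration of that idea; the label set, scores, and transition values are invented for the example and are not taken from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions:   list of T lists, emissions[t][y] = score of label y at step t
    transitions: transitions[p][y] = score of moving from label p to label y
    """
    n_labels = len(emissions[0])
    # score[y] = best score of any path ending in label y at the current step
    score = list(emissions[0])
    back = []  # backpointers for recovering the best path
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for y in range(n_labels):
            best_prev = max(range(n_labels),
                            key=lambda p: score[p] + transitions[p][y])
            new_score.append(score[best_prev] + transitions[best_prev][y]
                             + emissions[t][y])
            ptrs.append(best_prev)
        score = new_score
        back.append(ptrs)
    # Trace the backpointers from the best final label
    best_last = max(range(n_labels), key=lambda y: score[y])
    path = [best_last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return list(reversed(path))


# Hypothetical 3-label scheme: 0 = O, 1 = B-PER, 2 = I-PER.
# Transitions penalize the illegal O -> I-PER move and reward B-PER -> I-PER,
# so the CRF can prefer a coherent entity span even when token 1 is ambiguous.
transitions = [[0.5, 0.5, -2.0],   # from O
               [0.0, 0.0,  1.0],   # from B-PER
               [0.5, 0.0,  0.5]]   # from I-PER
emissions = [[0.2, 1.0, 0.1],      # token 0 prefers B-PER
             [0.6, 0.1, 0.5],      # token 1 is ambiguous (O vs I-PER)
             [1.0, 0.2, 0.1]]      # token 2 prefers O

print(viterbi_decode(emissions, transitions))  # → [1, 2, 0]
```

Note how the transition scores, not the emissions alone, pull token 1 toward I-PER: this dependence between adjacent labels is exactly what the paper's CRF layer learns during joint decoding.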
| ISSN | 1370-4621; 1573-773X |
| DOI | 10.1007/s11063-019-10044-6 |