Learning Morpheme Representation for Mongolian Named Entity Recognition
| Published in | Neural Processing Letters, Vol. 50, No. 3, pp. 2647–2664 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US, 01.12.2019 (Springer Nature B.V.) |
| Summary | Traditional approaches to Mongolian named entity recognition rely heavily on feature engineering. Worse, the complex morphological structure of Mongolian words aggravates data sparsity. To reduce both feature engineering and data sparsity in Mongolian named entity recognition, we propose a framework of recurrent neural networks with morpheme representations, and we study this framework in depth through several model variants. More specifically, the morpheme representation exploits characteristics of the classical Mongolian script and can be learned from an unlabeled corpus. The model is further augmented with different character representations and auxiliary language-model losses that extract contextual knowledge from scratch. By decoding jointly with a Conditional Random Field (CRF) layer, the model learns the dependencies between labels. Experimental results show that feeding morpheme representations into the neural network outperforms word representations; the additional character representation and the morpheme language-model loss also improve performance. |
|---|---|
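The joint CRF decoding mentioned in the summary is typically realized with the Viterbi algorithm: given per-token emission scores from the recurrent network and a learned label-transition matrix, it finds the globally best label sequence rather than picking each label independently. The sketch below is a minimal, self-contained illustration of that idea; the label set, scores, and transition values are invented for the example and are not taken from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence under a linear-chain CRF.

    emissions:   list of T lists, emissions[t][y] = score of label y at step t
    transitions: transitions[p][y] = score of moving from label p to label y
    """
    n_labels = len(emissions[0])
    # score[y] = best score of any path ending in label y at the current step
    score = list(emissions[0])
    back = []  # backpointers for recovering the best path
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for y in range(n_labels):
            best_prev = max(range(n_labels),
                            key=lambda p: score[p] + transitions[p][y])
            new_score.append(score[best_prev] + transitions[best_prev][y]
                             + emissions[t][y])
            ptrs.append(best_prev)
        score = new_score
        back.append(ptrs)
    # Trace the backpointers from the best final label
    best_last = max(range(n_labels), key=lambda y: score[y])
    path = [best_last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return list(reversed(path))


# Hypothetical 3-label scheme: 0 = O, 1 = B-PER, 2 = I-PER.
# Transitions penalize the illegal O -> I-PER move and reward B-PER -> I-PER,
# so the CRF can prefer a coherent entity span even when token 1 is ambiguous.
transitions = [[0.5, 0.5, -2.0],   # from O
               [0.0, 0.0,  1.0],   # from B-PER
               [0.5, 0.0,  0.5]]   # from I-PER
emissions = [[0.2, 1.0, 0.1],      # token 0 prefers B-PER
             [0.6, 0.1, 0.5],      # token 1 is ambiguous (O vs I-PER)
             [1.0, 0.2, 0.1]]      # token 2 prefers O

print(viterbi_decode(emissions, transitions))  # → [1, 2, 0]
```

Note how the transition scores, not the emissions alone, pull token 1 toward I-PER: this dependence between adjacent labels is exactly what the paper's CRF layer learns during joint decoding.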
| ISSN | 1370-4621; 1573-773X |
| DOI | 10.1007/s11063-019-10044-6 |