Mongolian Named Entity Recognition using suffixes segmentation

Bibliographic Details
Published in	2015 International Conference on Asian Language Processing (IALP), pp. 169-172
Main Authors	Weihua Wang, Feilong Bao, Guanglai Gao
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2015
Subjects	agglutinative language; Artificial neural networks; Instruments; Iron; mongolian; Morphology; named entity recognition; Organizations; Pragmatics

More Information
Summary:	Mongolian is an agglutinative language with complex morphological structure, so building an accurate Named Entity Recognition (NER) system for it is challenging and worthwhile. This paper analyzes the characteristics of Mongolian suffixes joined by the Narrow No-Break Space (NNBSP) and compares three methods for Mongolian NER within the Conditional Random Field (CRF) framework. The experiments show that segmenting each suffix into an individual token performs better than both leaving suffixes unsegmented and using the suffixes as a feature. The approach obtains an F-measure of 82.71 and is suitable for large-vocabulary Mongolian NER. The findings are also relevant to NER systems for other agglutinative languages.
ISBN:	1467395951 9781467395953
DOI:	10.1109/IALP.2015.7451558
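
Illustration:	The summary compares three ways of presenting suffixes to a CRF tagger. The following is a minimal sketch of those three input representations, not the authors' implementation. It assumes that suffixes are attached to the stem with the Narrow No-Break Space (U+202F); the function names, the feature layout, and the placeholder words are hypothetical and chosen only for illustration.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE, assumed to join stem and suffixes


def split_word(word):
    """Split one word into (stem, [suffixes]) at every NNBSP."""
    stem, *suffixes = word.split(NNBSP)
    return stem, suffixes


def tokens_unsegmented(words):
    """Method 1: no segmentation, each stem+suffix string is one token."""
    return list(words)


def tokens_suffix_feature(words):
    """Method 2: one token per word, suffixes carried as an extra feature.

    Assumption: the token is the whole word and the feature is the
    suffix chain joined by '+', or 'NONE' when there is no suffix.
    """
    rows = []
    for word in words:
        _, suffixes = split_word(word)
        rows.append({"token": word, "suffix": "+".join(suffixes) or "NONE"})
    return rows


def tokens_segmented(words):
    """Method 3: the stem and every suffix become separate tokens."""
    out = []
    for word in words:
        stem, suffixes = split_word(word)
        out.append(stem)
        out.extend(suffixes)
    return out


if __name__ == "__main__":
    # Placeholder strings, not real Mongolian words.
    sentence = "STEMA" + NNBSP + "SFX1" + NNBSP + "SFX2" + " " + "STEMB"
    words = sentence.split(" ")  # split on ordinary spaces only
    print(tokens_unsegmented(words))
    print(tokens_suffix_feature(words))
    print(tokens_segmented(words))
```

The sketch splits the sentence on ordinary spaces explicitly because Python's `str.split()` with no argument also treats U+202F as whitespace, which would separate suffixes from their stems before the three representations could be distinguished.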