Named Entity Recognition for Code-Mixed Indian Corpus using Meta Embedding
Published in | International Conference on Advanced Computing and Communication Systems (Online) pp. 68 - 72 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.03.2020 |
Subjects | |
Summary: | In this paper, we combine pre-trained embeddings, sub-word embeddings, and embeddings of languages closely related to those in the code-mixed corpus to create a meta-embedding. We then use a Transformer to encode the code-mixed sentence and a Conditional Random Field to predict the named entities in the code-mixed text. In contrast to classical named entity recognition, where the text is monolingual, our approach can predict named entities in a code-mixed corpus written in the native script as well as in Roman script. The novelty of our method lies in combining the embeddings of closely related languages to identify named entities in code-mixed Indian social media text written in native and Roman scripts. |
---|---|
ISBN: | 1728151961 9781728151960 |
ISSN: | 2575-7288 |
DOI: | 10.1109/ICACCS48705.2020.9074379 |
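
The summary describes a pipeline of meta-embedding, Transformer encoding, and Conditional Random Field prediction, but does not say how the source embeddings are combined. The sketch below is only an illustration under stated assumptions: it combines per-token source vectors with a softmax-weighted sum after projecting them to a shared dimension (one common way to build a meta-embedding, not necessarily the authors'), and decodes tags with standard Viterbi search for a linear-chain CRF. The Transformer encoder is omitted, and the names `meta_embed`, `viterbi_decode`, `projections`, and `query` are hypothetical.

```python
import numpy as np


def meta_embed(vectors, projections, query):
    """Attention-weighted meta-embedding for a single token (assumed scheme).

    vectors:     list of 1-D arrays, one per source embedding (sizes may differ)
    projections: list of (d, d_i) matrices mapping each source to a shared d-dim space
    query:       (d,) vector used to score each projected source (hypothetical parameter)
    """
    # Project every source embedding into the shared space: (n_sources, d)
    projected = np.stack([P @ v for P, v in zip(projections, vectors)])
    # One scalar score per source, normalized with a numerically stable softmax
    scores = projected @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted sum of the projected sources is the meta-embedding
    return weights @ projected, weights


def viterbi_decode(emissions, transitions):
    """Highest-scoring tag sequence under a linear-chain CRF.

    emissions:   (T, K) per-token tag scores, e.g. from an encoder
    transitions: (K, K) score of moving from tag i to tag j
    """
    T, K = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # candidate[i, j]: best path ending in tag i at t-1, then tag j at t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # Follow back-pointers from the best final tag
    best = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]
```

In a full model the projection matrices, the scoring vector, and the transition scores would be learned jointly with the encoder; here they are plain arrays so that the two steps can be inspected in isolation.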