Generating a Lexicon for the Hijazi Dialect in Arabic
Published in | Arabic Language Processing: From Theory to Practice, pp. 3-17 |
---|---|
Main Authors | , |
Format | Book Chapter |
Language | English |
Published | Cham: Springer International Publishing |
Series | Communications in Computer and Information Science |
Summary: | We present a methodology for creating a lexicon for a low-resource Arabic dialect in Saudi Arabia: Hijazi. We show the differences between the Hijazi dialect and Modern Standard Arabic. We annotate articles and tweets using recruited native speakers. We create a lexicon of Hijazi adapted from two resources: Sebawai and Quranic Arabic Corpus. The lexicon is created both manually and automatically by using Hijazi morphology. We detail the methodology to build this lexicon and present results of an evaluation of the corpus formation process. |
---|---|
ISBN: | 3030329585 9783030329587 |
ISSN: | 1865-0929 1865-0937 |
DOI: | 10.1007/978-3-030-32959-4_1 |