Generating a Lexicon for the Hijazi Dialect in Arabic

Bibliographic Details
Published in: Arabic Language Processing: From Theory to Practice, pp. 3-17
Main Authors: Alqahtani, Fatimah Abdullah; Sanderson, Mark
Format: Book Chapter
Language: English
Published: Cham, Springer International Publishing
Series: Communications in Computer and Information Science
Summary: We present a methodology for creating a lexicon for a low-resource Arabic dialect in Saudi Arabia: Hijazi. We show the differences between the Hijazi dialect and Modern Standard Arabic. We annotate articles and tweets using recruited native speakers. We create a lexicon of Hijazi adapted from two resources: Sebawai and Quranic Arabic Corpus. The lexicon is created both manually and automatically by using Hijazi morphology. We detail the methodology to build this lexicon and present results of an evaluation of the corpus formation process.
ISBN: 3030329585, 9783030329587
ISSN: 1865-0929, 1865-0937
DOI: 10.1007/978-3-030-32959-4_1