Tag based models for Arabic text compression

Bibliographic Details
Published in	2017 Intelligent Systems Conference (IntelliSys), pp. 697-705
Main Authors	Alkhazi, Ibrahim S.; Alghamdi, Mansoor A.; Teahan, William J.
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2017
Subjects	Adaptation models; Arabic; Compression algorithms; Encoding; Natural language processing; Predictive models; Task analysis; Text compression

Summary:	Text compression is needed to reduce the space required to store information contained in the text and the amount of time needed to transmit that information. Compression-based models such as the Prediction-by-Partial Matching (PPM) compression scheme have been found very effective for many further natural language processing tasks such as authorship ascription, text categorization, and word segmentation for various languages, including English, Chinese and Arabic. Therefore, this paper explores an approach of compressing Arabic text using parts-of-speech (tags) along with the text based on the PPM compression scheme. This new approach produces significantly better compression results when compared to state-of-the-art compression algorithms for Arabic text.
DOI:	10.1109/IntelliSys.2017.8324370