Tag based models for Arabic text compression
Published in | 2017 Intelligent Systems Conference (IntelliSys) pp. 697 - 705 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2017 |
Subjects | |
Summary: | Text compression reduces the space required to store the information contained in a text and the time needed to transmit it. Compression-based models such as the Prediction by Partial Matching (PPM) compression scheme have proved very effective for many natural language processing tasks, such as authorship ascription, text categorization, and word segmentation, in various languages, including English, Chinese and Arabic. This paper therefore explores an approach to compressing Arabic text that uses part-of-speech tags alongside the text, based on the PPM compression scheme. The new approach produces significantly better compression results than state-of-the-art compression algorithms for Arabic text. |
---|---|
DOI: | 10.1109/IntelliSys.2017.8324370 |