Arabic keyphrases extraction using a hybrid of statistical and machine learning methods


Bibliographic Details
Published in: Proceedings of the 6th International Conference on Information Technology and Multimedia, pp. 281-286
Main Authors: Ali, Nidaa Ghalib; Omar, Nazlia
Format: Conference Proceeding
Language: English
Published: IEEE, 01.11.2014
DOI: 10.1109/ICIMU.2014.7066645

Summary: Keyphrases are single-word or multi-word lexemes that concisely and accurately describe the subject, or an aspect of the subject, discussed in a document. Manually assigning keyphrases is tedious and time-consuming, especially given the proliferation of Web content. Thus, automatic keyphrase generation systems are urgently needed. This study proposes a keyphrase extraction method that combines several keyphrase extraction methods with machine learning approaches (linear logistic regression, linear discriminant analysis, and support vector machines). The proposed method uses the output of several keyphrase extraction methods as input features for a machine learning algorithm, which then determines whether each term is a keyphrase. Results show that the SVM algorithm achieves the best performance, with an F1-measure of 88.31%. This value is relatively high and comparable with those of previous keyphrase extraction models for the Arabic language.
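
The hybrid idea in the summary (scores from several extraction methods used as input features for a classifier that labels each term) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the record does not name the individual extraction methods, so term frequency and first-occurrence position stand in for them; the classifier is a plain logistic regression trained by stochastic gradient descent rather than the SVM reported as best; and the English toy document and its labels are invented for the example.

```python
import math
from collections import Counter

# Toy tokenized document (hypothetical; a real system would work on Arabic
# text after tokenization and normalization).
TOKENS = ("keyphrase extraction for arabic text uses keyphrase features and "
          "arabic keyphrase statistics with a classifier").split()


def tf_score(term, tokens):
    """Stand-in extractor 1: relative frequency of the term in the document."""
    return Counter(tokens)[term] / len(tokens)


def first_position_score(term, tokens):
    """Stand-in extractor 2: 1.0 if the term opens the document, lower if it first appears later."""
    return 1.0 - tokens.index(term) / len(tokens)


def features(term, tokens):
    """Feature vector = the outputs of the individual extraction methods."""
    return [tf_score(term, tokens), first_position_score(term, tokens)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with per-sample gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """True if the classifier considers the term a keyphrase."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


# Hypothetical gold labels for a few candidate terms (1 = keyphrase).
LABELS = {"keyphrase": 1, "arabic": 1, "and": 0, "statistics": 0,
          "with": 0, "a": 0, "classifier": 0}
X = [features(term, TOKENS) for term in LABELS]
y = list(LABELS.values())
w, b = train_logistic(X, y)

# Classify candidate terms that were not part of the training set.
for term in ("extraction", "text"):
    print(term, predict(w, b, features(term, TOKENS)))
```

Any of the three classifiers named in the summary could take the place of `train_logistic` here; only the feature construction step, where each extractor contributes one column, is specific to the hybrid design.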