Indonesian News Articles Summarization Using Genetic Algorithm

Bibliographic Details
Published in	Engineering Letters, Vol. 30, no. 1, p. 152
Main Authors	Khotimah, Nurul; Girsang, Abba Suganda
Format	Journal Article
Language	English
Published	Hong Kong: International Association of Engineers, 24.02.2022
Subjects	Documents; Genetic algorithms; Optimization; Similarity; Indonesia
Summary:	Extractive text summarization consists of selecting the most important sentences from the original text. A summary can help readers understand an article more easily and quickly than reading it in full. Summarizing involves gathering as much of the information as possible and presenting only the most important details as succinctly as possible. To address this problem, a genetic algorithm is adopted to extract sentences as a summary. Summarization is treated as an optimization problem in which the optimal summary is selected from the sentences of the original document. The genetic algorithm optimizes sentence selection to obtain a summary that represents the main idea of the source document, with the compression rate determining the number of sentences selected. To represent the text and capture the connections between sentences, a graph is constructed and weighted with PageRank scores. The dataset consists of 60 news articles in Bahasa Indonesia from IndoSum. To evaluate the results, ROUGE-1 and cosine similarity are calculated between the system-generated summary and the reference summary. The study also compares the approach with five other methods: SumBasic, LexRank, LSA, TextRank, and KLSum. The proposed method yields better summaries than the other methods, with an average ROUGE-1 recall of 0.641 and a cosine similarity of 0.625 at a compression rate of 30%.
ISSN:	1816-093X 1816-0948
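
The pipeline outlined in the summary (a sentence graph weighted with PageRank, a genetic algorithm choosing a subset whose size is fixed by the compression rate, and ROUGE-1 recall for evaluation) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the record does not give the fitness function, the genetic operators, or any parameter values, so the bag-of-words cosine edge weights, the sum-of-PageRank fitness, the crossover and mutation rules, and every name and default below (`summarize`, `pop_size`, `generations`, the 0.2 mutation probability, and so on) are assumptions made for the example.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pagerank(weights, damping=0.85, iters=50):
    """Power iteration on a weighted sentence graph (adjacency matrix)."""
    n = len(weights)
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(weights[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j])
            for i in range(n)
        ]
    return scores


def summarize(sentences, rate=0.3, pop_size=30, generations=40, seed=0):
    """Pick round(rate * n) sentences with a small genetic algorithm."""
    if not sentences:
        return []
    rng = random.Random(seed)
    n = len(sentences)
    k = max(1, round(rate * n))  # compression rate fixes the summary length
    bags = [Counter(tokenize(s)) for s in sentences]
    weights = [[cosine(bags[i], bags[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    scores = pagerank(weights)

    def fitness(chrom):
        # Assumed fitness: total PageRank score of the chosen sentences.
        return sum(scores[i] for i in chrom)

    def crossover(a, b):
        # Child draws k sentence indices from the union of both parents.
        return frozenset(rng.sample(sorted(a | b), k))

    def mutate(chrom):
        # With probability 0.2, swap one chosen sentence for an unchosen one.
        outside = [i for i in range(n) if i not in chrom]
        if outside and rng.random() < 0.2:
            chrom = set(chrom)
            chrom.remove(rng.choice(sorted(chrom)))
            chrom.add(rng.choice(outside))
        return frozenset(chrom)

    population = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]  # keep the better half
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    best = max(population, key=fitness)
    return [sentences[i] for i in sorted(best)]  # keep document order


def rouge1_recall(candidate, reference):
    """Clipped unigram overlap divided by the reference unigram count."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    total = sum(ref.values())
    return sum(min(cand[t], ref[t]) for t in ref) / total if total else 0.0
```

Calling `summarize(sentences, rate=0.3)` on a list of sentence strings returns roughly 30% of them in their original order, and `rouge1_recall(" ".join(summary), reference)` scores the result against a reference summary. With a purely additive fitness like the one assumed here, simply taking the top-ranked sentences would reach the same optimum; a genetic search becomes more useful once the fitness includes pairwise terms such as a redundancy penalty.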