Performance Comparison of Rabin-Karp Algorithm and Winnowing Algorithm for Document Abstraction Similarity Detection

Bibliographic Details
Published in 2022 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS) pp. 281-286
Main Authors Dwi Hartanto, Anggit; Pristyanto, Yoga; Saputra, Andy; Pujastuti, Eli; Nurmasani, Atik; Asti Astuti, Ika
Format Conference Proceeding
Language English
Published IEEE 16.11.2022
Summary: Plagiarism is a recurring risk in academic settings; a concrete example is the plagiarism of assignments during coursework. One way to prevent it is to use one of the detection systems developed in recent years, but the subscription fees for such systems are quite high. A plagiarism detection system that everyone can access for free is therefore needed, and developing one requires choosing a similarity model. Two popular algorithms are often used to detect word similarity: Rabin-Karp and Winnowing. This study compares the performance of the two on the same dataset of 30 abstracts from scientific publications. The results show that the two algorithms perform almost equally well. However, on both evaluation parameters used, Winnowing produces higher similarity scores than Rabin-Karp. Winnowing also takes slightly longer to process than Rabin-Karp. The comparison leads to the conclusion that Winnowing is recommended when a model sensitive to similarity is wanted, whereas Rabin-Karp is recommended when processing time is the priority. Future research should tune the parameter values k or n in the Rabin-Karp and Winnowing algorithms and experiment with other distance scoring methods.
DOI: 10.1109/ICIMCIS56303.2022.10017488
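
The two fingerprinting approaches compared in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the hash base and modulus, the window size `w`, and the use of the Dice coefficient as the similarity score are all assumptions chosen for illustration, and the record does not state which preprocessing or scoring the paper used. Both variants hash every k-gram with a Rabin-Karp style rolling hash; the Winnowing variant then keeps only the minimum hash of each sliding window.

```python
import re


def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def rolling_hashes(text, k, base=256, mod=1_000_003):
    """Hash every k-gram of the normalized text with a rolling hash.

    base and mod are illustrative values, not taken from the paper.
    """
    cleaned = normalize(text)
    if k <= 0 or len(cleaned) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in cleaned[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(cleaned)):
        # Remove the oldest character, shift, then add the newest one.
        h = ((h - ord(cleaned[i - k]) * high) * base + ord(cleaned[i])) % mod
        hashes.append(h)
    return hashes


def rabin_karp_fingerprints(text, k):
    """Assumed Rabin-Karp variant: every k-gram hash is a fingerprint."""
    return set(rolling_hashes(text, k))


def winnowing_fingerprints(text, k, w):
    """Assumed Winnowing variant: keep the minimum hash of each window of w hashes."""
    hashes = rolling_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def dice(a, b):
    """Dice coefficient between two fingerprint sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    doc1 = "Plagiarism detection compares fingerprints of two documents."
    doc2 = "Plagiarism detection compares the fingerprints of documents."
    k, w = 5, 4  # hypothetical parameter values
    print(dice(rabin_karp_fingerprints(doc1, k), rabin_karp_fingerprints(doc2, k)))
    print(dice(winnowing_fingerprints(doc1, k, w), winnowing_fingerprints(doc2, k, w)))
```

Because this sketch stores hash values in sets, it ignores fingerprint positions, which a full Winnowing implementation would normally record. Varying `k` and `w` changes how many fingerprints each document contributes, which is the kind of parameter tuning the summary recommends for future work.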