Performance Comparison of Rabin-Karp Algorithm and Winnowing Algorithm for Document Abstraction Similarity Detection
Published in | 2022 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS) pp. 281 - 286 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 16.11.2022 |
Summary: | Plagiarism is a recurring risk in academic settings; a common example is the plagiarism of course assignments. One way to deter it is to use the detection systems developed in recent years, but subscribing to such systems is expensive. A plagiarism detection system that anyone can use for free is therefore needed, and building one requires choosing a similarity model. Two popular algorithms for detecting word similarity are Rabin-Karp and Winnowing. This study compares their performance on the same dataset of 30 abstracts from scientific publications. The two algorithms perform almost equally well. However, on the two evaluation parameters used, Winnowing produces higher similarity scores than Rabin-Karp, and it also takes slightly longer to process. The authors conclude that Winnowing is recommended when a model sensitive to similarity is wanted, and Rabin-Karp when processing time is the priority. For future research, they recommend tuning the parameter values k or n in both algorithms and experimenting with other distance scoring methods. |
---|---|
DOI: | 10.1109/ICIMCIS56303.2022.10017488 |
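
The record does not give the paper's implementation, parameter values, or scoring function, so the following is only a minimal sketch of how the two approaches named in the summary are commonly set up. It assumes k-gram fingerprints built with a Rabin-Karp style rolling hash, a Winnowing step that keeps the minimum hash in each window, and Jaccard similarity as the score. The function names, the window parameter `w`, the hash base and modulus, and the choice of Jaccard are all illustrative assumptions, not details taken from the paper.

```python
def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return "".join(ch.lower() for ch in text if ch.isalnum())


def kgram_hashes(text, k, base=256, mod=1_000_003):
    """Rolling polynomial hash of every k-gram (Rabin-Karp style).

    base and mod are arbitrary illustrative choices.
    """
    s = normalize(text)
    if len(s) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the leading character
    h = 0
    for ch in s[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(s)):
        # Remove the outgoing character, shift, add the incoming one.
        h = ((h - ord(s[i - k]) * high) * base + ord(s[i])) % mod
        hashes.append(h)
    return hashes


def winnow(hashes, w):
    """Keep the minimum hash of each window of w consecutive hashes."""
    if not hashes:
        return set()
    last_start = max(len(hashes) - w, 0)
    return {min(hashes[i:i + w]) for i in range(last_start + 1)}


def jaccard(a, b):
    """Jaccard similarity of two fingerprint sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rabin_karp_similarity(doc_a, doc_b, k):
    """Compare the full sets of k-gram hashes."""
    return jaccard(set(kgram_hashes(doc_a, k)), set(kgram_hashes(doc_b, k)))


def winnowing_similarity(doc_a, doc_b, k, w):
    """Compare only the fingerprints selected by winnowing."""
    fp_a = winnow(kgram_hashes(doc_a, k), w)
    fp_b = winnow(kgram_hashes(doc_b, k), w)
    return jaccard(fp_a, fp_b)
```

In this sketch, `k` controls the length of the compared substrings and `w` controls how many fingerprints Winnowing keeps, which is the kind of parameter tuning the summary suggests for future work. Swapping `jaccard` for another set-based measure would be one way to experiment with a different scoring.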