Plagiarism Detection using ROUGE and WordNet
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 22.03.2010 |
Subjects | |
Summary: | Journal of Computing, Volume 2, Issue 3, March 2010. With the arrival of the digital era and the Internet, the lack of information control gives people an incentive to freely use any content available to them. Plagiarism occurs when users fail to credit the original owner of the content they refer to, and such behavior violates intellectual property rights. The two main approaches to plagiarism detection are fingerprinting and term occurrence; however, both share a weakness, most pronounced in fingerprinting: they cannot detect plagiarism in which the text has been modified. This study proposes adopting ROUGE and WordNet for plagiarism detection. The former includes n-gram co-occurrence statistics, skip-bigram, and longest common subsequence (LCS), while the latter acts as a thesaurus and provides semantic information. N-gram co-occurrence statistics can detect verbatim copying and certain sentence modifications; skip-bigram and LCS are robust to text modifications such as the simple addition or deletion of words; and WordNet may handle the problem of word substitution. |
---|---|
DOI: | 10.48550/arxiv.1003.4065 |
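
The three ROUGE-style measures named in the summary can be illustrated with a small sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the recall-only scoring, and the unlimited skip distance for skip-bigrams are all assumptions made for illustration, and the WordNet synonym step is left out because it needs an external lexical database.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_recall(src_counts, sus_counts):
    """Share of source units that also occur in the suspect (counts clipped)."""
    total = sum(src_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(c, sus_counts[g]) for g, c in src_counts.items())
    return overlap / total


def ngram_recall(source, suspect, n=2):
    """N-gram co-occurrence: sensitive to any break in contiguous wording."""
    return clipped_recall(Counter(ngrams(source, n)), Counter(ngrams(suspect, n)))


def skip_bigram_recall(source, suspect):
    """Skip-bigram: ordered word pairs with any gap (assumed unlimited here)."""
    return clipped_recall(Counter(combinations(source, 2)),
                          Counter(combinations(suspect, 2)))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_recall(source, suspect):
    """Share of the source recovered, in order, inside the suspect."""
    return lcs_length(source, suspect) / len(source) if source else 0.0


# Hypothetical example: the suspect inserts one word into the source.
source = "the cat sat on the mat".split()
suspect = "the cat quietly sat on the mat".split()

print(ngram_recall(source, suspect))        # bigram recall drops: 0.8
print(skip_bigram_recall(source, suspect))  # unaffected by the insertion: 1.0
print(lcs_recall(source, suspect))          # unaffected by the insertion: 1.0
```

In this toy case the inserted word breaks one of the five source bigrams, while the skip-bigram and LCS scores stay at 1.0, which matches the summary's claim that those two measures tolerate simple additions. A word substitution would lower all three scores, which is the gap the summary assigns to WordNet.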