K.L. Verco and M.J. Wise, Plagiarism a la Mode - A Comparison of Automated Systems for Detecting Suspected Plagiarism, Computer Journal, 39(9), 1996, pp. 741-750
Early automated systems for detecting plagiarism in student programs
employed attribute counting techniques in their comparisons of program
texts, while more recent systems use encoded structural information.
Whale claims that the latter are more effective in their detection of
plagiarisms than systems based on attribute counting. To explore the
validity of these claims, a comparison is presented of five systems,
two based on attribute counting and three using metrics based on
structure. The major result of this study is that the systems based on
structural information consistently equal or better the performance of
systems based on attribute counting metrics. A second conclusion is
that, of the structure metric systems, one using approximate
tokenization of input texts (YAP) is as effective as a system that
undertakes a complete parse (Plague). Approximate tokenization offers a
considerable reduction in the costs of porting to new languages. A
distinction is also made between forms of plagiarism common among
novice programmers and those employed by more experienced programmers.
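
The contrast between the two families of metric can be illustrated with a
minimal sketch. Everything in it is an assumption made for illustration and
is not taken from the paper: the keyword set, the regular expression, the
function names, and the three-number attribute profile are hypothetical, and
Python's difflib.SequenceMatcher stands in for whatever sequence comparison
YAP or Plague actually perform.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical keyword set for a small C-like language (an assumption).
KEYWORDS = {"if", "else", "while", "for", "return", "int"}

# Identifiers, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def approx_tokens(source):
    """Map source text to a coarse token stream without parsing it."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)          # keywords are kept as they are
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")            # every identifier collapses to ID
        elif lexeme[0].isdigit():
            tokens.append("NUM")           # every integer literal to NUM
        else:
            tokens.append(lexeme)          # punctuation is kept as it is
    return tokens


def attribute_profile(source):
    """Attribute counting: summarise a program as a few counts."""
    tokens = approx_tokens(source)
    return (len(tokens), len(set(tokens)), Counter(tokens)["ID"])


def structure_similarity(a, b):
    """Structure metric: compare the order of tokens, not just the counts."""
    matcher = SequenceMatcher(None, approx_tokens(a), approx_tokens(b),
                              autojunk=False)
    return matcher.ratio()


original = "int total = 0; for (i = 0; i < n; i++) total = total + a[i];"
renamed = "int sum = 0; for (k = 0; k < m; k++) sum = sum + v[k];"
print(attribute_profile(original), attribute_profile(renamed))
print(structure_similarity(original, renamed))
```

Because identifiers collapse to a single token, renaming variables leaves the
token stream unchanged. Supporting another language in this sketch would mean
changing only the keyword set and the regular expression, which is the kind
of porting saving that approximate tokenization is meant to offer over a full
parser.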