Universitas Hasanuddin
Research output: Contribution to journal, Article, peer-review

Document similarity detection using K-Means and cosine distance

Usino W.

International Journal of Advanced Computer Science and Applications

Q3
Published: 2019
Citations: 17

Abstract

A two-year study by Indonesia's Ministry of Research, Technology and Education evaluated most universities in the country. The evaluation found irregularities in the electronic copies of many doctoral dissertations, which closely resembled texts available on the internet. This suspected plagiarism has a negative effect on both students and faculty members. The main reason behind this behavior is the lack of a standardized awareness of plagiarism among faculty members. This study therefore proposes a computerized system that detects plagiarism using K-means clustering and cosine distance. Processing begins with a preprocessing stage, which includes a novel step of checking words against the Indonesian big dictionary, followed by vector space model design and the combined K-means and cosine distance calculation, with 17 documents as test data. Overall, the results show a detection accuracy of 93.33% on these documents.
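The pipeline named in the abstract (preprocessing with a dictionary check, a vector space model, then K-means combined with cosine distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the raw term-frequency weighting, the farthest-first seeding, and the toy English documents are all assumptions made for illustration, and the optional `dictionary` argument is only a stand-in for the Indonesian big dictionary lookup.

```python
import math
import re
from collections import Counter


def tokenize(text, dictionary=None):
    """Lowercase and split into words; optionally keep only dictionary words.

    The dictionary filter is a hypothetical stand-in for the paper's
    Indonesian big dictionary check.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if dictionary is not None:
        tokens = [t for t in tokens if t in dictionary]
    return tokens


def build_vectors(docs, dictionary=None):
    """Vector space model: one raw term-frequency vector per document."""
    counts = [Counter(tokenize(d, dictionary)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[float(c[t]) for t in vocab] for c in counts]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_distance(a, b):
    return 1.0 - cosine_similarity(a, b)


def kmeans(vectors, k, iterations=20):
    """K-means that assigns points by cosine distance (assumed combination).

    Seeding is deterministic: start from the first vector, then repeatedly
    add the vector farthest from the centroids chosen so far.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(cosine_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy documents (illustrative only, not the paper's test data).
docs = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stock prices fell sharply today",
    "stock prices rose sharply today",
]
vocab, vectors = build_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
print(round(cosine_similarity(vectors[0], vectors[1]), 3))
```

In a sketch like this, documents that fall into the same cluster and also have a high pairwise cosine similarity would be the candidates for closer inspection; the similarity threshold to use is a separate choice that the abstract does not specify.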

Fingerprint

Cosine similarity
Computer science
Plagiarism detection
Similarity (geometry)
Christian ministry
Process (computing)
The Internet
Indonesian
Trigonometric functions
Space (punctuation)
Vector space model
Information retrieval
Test (biology)
Trigonometry
Preprocessor
Artificial intelligence
World Wide Web
Pattern recognition (psychology)
Mathematics
Operating system
Geometry
Biology
Image (mathematics)
Paleontology
Theology
Philosophy
Linguistics