intrinsic plagiarism detection
Recently Published Documents


TOTAL DOCUMENTS: 18 (FIVE YEARS: 5)

H-INDEX: 5 (FIVE YEARS: 1)

2020 ◽  
Vol 4 (5) ◽  
pp. 988-997
Author(s):  
Sylvia Putri Gunawan ◽  
Lucia Dwi Krisnawati ◽  
Antonius Rachmat Chrismanto

Two different paradigms in the field of plagiarism detection have resulted in External Plagiarism Detection (EPD) and Intrinsic Plagiarism Detection (IPD) systems. The most commonly applied system is EPD, which requires its algorithm to make a heuristic comparison between a suspicious document and the documents in a corpus. In contrast, given only a suspicious document, an IPD algorithm should be able to find the plagiarized sections by looking for text segments written in a different style. Previous research on Indonesian texts has been confined to the development of EPD systems. Therefore, this research focuses on experimenting with and analyzing stylometric features and segmentation strategies in order to build an IPD system for Indonesian texts. The experimental results show that paragraph-level segmentation performs better than the other strategies tested, scoring 0.92 for macro-averaged accuracy and 0.54 for macro-averaged F1. The stylometric features achieving the highest F1 and accuracy scores are the frequency of punctuation, the average paragraph length, and the type-token ratio.
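As a rough illustration of the three features named in this abstract, the sketch below splits a text into paragraph segments and computes a punctuation frequency, a paragraph length, and a type-token ratio for each one. It is not the authors' implementation: the function names, the blank-line paragraph split, the regex tokenizer, the punctuation set, and the choice to measure length in tokens are all assumptions made for the example.

```python
import re

# Assumed punctuation inventory; the paper's actual set is not given in the abstract.
PUNCTUATION = set(".,;:!?\"'()-")


def split_paragraphs(text):
    """Split a document into paragraph segments on blank lines (an assumption)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def stylometric_features(paragraph):
    """Compute three simple stylometric features for one paragraph.

    Hypothetical helper: tokenization is a plain word regex on lowercased text.
    """
    tokens = re.findall(r"\w+", paragraph.lower())
    n_chars = len(paragraph)
    n_punct = sum(1 for ch in paragraph if ch in PUNCTUATION)
    return {
        # Share of characters that are punctuation marks.
        "punctuation_frequency": n_punct / n_chars if n_chars else 0.0,
        # Paragraph length measured in word tokens.
        "paragraph_length": len(tokens),
        # Distinct word types divided by total word tokens.
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


if __name__ == "__main__":
    sample = "Saya makan nasi. Saya minum air!\n\nDia pergi, lalu pulang."
    for segment in split_paragraphs(sample):
        print(stylometric_features(segment))
```

A segment whose feature values differ markedly from the rest of the document would then be a candidate for closer inspection; how the features are combined and classified is outside what the abstract states.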


2019 ◽  
Vol 96 ◽  
pp. 700-712 ◽  
Author(s):  
Muna AlSallal ◽  
Rahat Iqbal ◽  
Vasile Palade ◽  
Saad Amin ◽  
Victor Chang

Author(s):  
Michael P. Oakes

Author profiling is the analysis of people’s writing in an attempt to find out which classes they belong to, such as gender, age group or native language. Many of the techniques for author profiling are derived from the related task of Author Identification, so we will look at this topic first. Author identification is the task of finding out who is most likely to have written a disputed document, and there are a number of computational approaches to this. The three main subtasks are the compilation of corpora of texts known to be written by the candidate authors, the selection of linguistic features to represent those texts, and statistics for discriminating between those features which are most indicative of a particular author’s writing style. Plagiarism is the unacknowledged use of another author’s original work, and we will look at software for its detection. The chapter will cover the types of text obfuscation strategies used by plagiarists, commercial plagiarism detection software and its shortcomings, and recent research systems. Strategies have been developed for both external plagiarism detection (where the original source is searched for in a large document collection) and intrinsic plagiarism detection (where the source text is not available, necessitating a search for inconsistencies within the suspicious document). The specific problems of plagiarism by translation of an original in another language, and the unauthorized copying of sections of computer code, are described. Evaluation forums and publicly available test data sets are covered for each of the main topics of this chapter.
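The idea of searching for inconsistencies within a single suspicious document can be sketched as a simple outlier test: given one style score per text segment, flag the segments whose score lies far from the document's mean. This is only a minimal illustration under stated assumptions, not a method described in the chapter; the function name, the use of z-scores, and the threshold of 2.0 are all hypothetical choices.

```python
from statistics import mean, pstdev


def flag_outlier_segments(values, threshold=2.0):
    """Return indices of segments whose style score deviates strongly from the mean.

    `values` holds one numeric style score per segment (for example, a
    type-token ratio). The z-score rule and default threshold are assumptions.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all segments
    if sigma == 0:
        # All segments look identical on this feature, so nothing stands out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    scores = [0.50, 0.52, 0.49, 0.51, 0.50, 0.90]
    print(flag_outlier_segments(scores))
```

With few segments a z-score can never grow very large, so a practical system would likely need a different threshold or a more robust statistic; that trade-off is left open here.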


2018 ◽  
Vol 11 (3) ◽  
pp. 503-515 ◽  
Author(s):  
Andrianna Polydouri ◽  
Eleni Vathi ◽  
Georgios Siolas ◽  
Andreas Stafylopatis

2014 ◽  
Author(s):  
Imene Bensalem ◽  
Paolo Rosso ◽  
Salim Chikhi
