Outlier-based approaches for intrinsic and external plagiarism detection

Oberreuter G.; L'Huillier G.; Ríos S.A.; Velásquez J.D.

Keywords: systems, search, information, reduction, research, space, classification, statistics, knowledge, algorithms, text, data, detection, complexity, mining, property, institutions, teams, plagiarism, processing, outlier, intellectual, Computational, Educational, based, engines, Text-matching

Abstract

Plagiarism detection, one of the main problems educational institutions have faced since Internet access became widespread, can be treated as a classification problem that draws on both self-based information and text processing algorithms; the latter are computationally intractable unless the search space is first reduced. Intrinsic plagiarism detection relies on self-based information: it is framed as an outlier detection problem in which the classifier must decide whether plagiarism is present using only the text of the given document. External plagiarism detection relies on text matching algorithms, for which it is essential to shrink the matching space with search space reduction techniques; this reduction can itself be posed as another outlier detection problem. The main contribution of this work is the inclusion of text outlier detection methodologies to enhance both intrinsic and external plagiarism detection. Results show that our approach is highly competitive with those of the leading research teams in plagiarism detection. © 2011 Springer-Verlag.
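
To illustrate the idea of intrinsic detection as outlier detection, the following is a minimal sketch and not the authors' method. It assumes character trigram profiles as the style representation, cosine distance to the whole-document profile as the deviation measure, and a z-score cutoff (`z_threshold`) as the outlier rule; all function names and parameters are hypothetical choices made for this example.

```python
import math
from collections import Counter


def char_trigram_profile(text):
    """Count overlapping character trigrams in lowercased text (assumed style feature)."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def flag_outlier_segments(segments, z_threshold=1.5):
    """Return indices of segments whose style deviates strongly from the document.

    Only the document's own text is used: each segment is compared with the
    profile built from all segments, and segments whose distance lies more than
    z_threshold standard deviations above the mean distance are flagged.
    """
    segment_profiles = [char_trigram_profile(s) for s in segments]

    # Whole-document profile: the sum of all segment profiles.
    doc_profile = Counter()
    for profile in segment_profiles:
        doc_profile.update(profile)

    distances = [cosine_distance(p, doc_profile) for p in segment_profiles]
    mean = sum(distances) / len(distances)
    std = math.sqrt(sum((d - mean) ** 2 for d in distances) / len(distances))
    if std == 0:
        return []  # no segment stands out
    return [i for i, d in enumerate(distances) if (d - mean) / std > z_threshold]


if __name__ == "__main__":
    own_style = "the cat sat on the mat and the dog sat on the log"
    foreign_style = "zzxq vvkj qqpw xxzj kkvq wwxz jjqx"
    print(flag_outlier_segments([own_style] * 5 + [foreign_style]))
```

In this toy run the last segment shares no trigrams with the rest of the document, so it is the only one flagged. A realistic system would need richer features and a tuned threshold; the sketch only shows how a decision can be made from the document's own text.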

More information

Journal title: LEARNING AND INTELLIGENT OPTIMIZATION, LION 15
Volume: 6882
Issue: PART 2
Publisher: SPRINGER INTERNATIONAL PUBLISHING AG
Publication date: 2011
Start page: 11
End page: 20
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-80053156982&partnerID=q2rCbXpz