Academia.edu

Plagiarism Detection

1,905 papers
23,424 followers
About this topic
Plagiarism detection is the process of identifying instances of copied or improperly attributed content in written works. It involves the use of software tools and algorithms to compare texts against a database of sources, assessing originality and ensuring academic integrity in scholarly and professional writing.

Key research themes

1. How can linguistic and semantic features improve the accuracy of external plagiarism detection in text documents?

This research area focuses on enhancing plagiarism detection accuracy by integrating linguistic knowledge, including semantic relations and syntactic structures, to better capture the meaning of texts beyond surface similarity. It addresses challenges such as detecting paraphrasing, synonym substitution, sentence structure transformation, and active-passive voice changes, which traditional text-matching approaches often miss. These methods matter because they offer more robust detection of sophisticated plagiarism forms that rely on disguising copied ideas through language manipulation, thereby improving the reliability of plagiarism detection tools.

Key finding: This paper proposed the PDLK method that jointly computes semantic and syntactic similarity between sentences using lexical databases and sentence structural analysis. It demonstrated improved detection of various plagiarism...
Key finding: This study introduced a specialized plagiarism detection approach combining semantic role labeling (SRL) to identify active-passive sentence transformations, syntactic word-order information, and content word expansion. By...
Key finding: This work presented a text plagiarism detection system that extracted 34 sentence similarity features encompassing lexical, syntactic, and semantic properties, followed by feature selection with Chi-square and classification...
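The idea of scoring sentence pairs on both shared vocabulary and word order can be illustrated with a minimal sketch. This is not the PDLK method or any of the systems summarized above: it uses no lexical database or semantic role labeling, and the stopword list, function names, and the `weight` parameter are assumptions chosen for illustration only.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "by", "is", "was",
             "were", "are", "to", "and"}


def content_words(sentence):
    """Lowercase, tokenize on letters, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def lexical_similarity(a, b):
    """Jaccard overlap of the two content-word sets (ignores order)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def order_similarity(a, b):
    """Fraction of shared-word pairs that keep the same relative order."""
    in_b = set(b)
    shared = [w for w in dict.fromkeys(a) if w in in_b]
    if len(shared) < 2:
        return 1.0 if shared else 0.0
    pos_a = {w: a.index(w) for w in shared}
    pos_b = {w: b.index(w) for w in shared}
    pairs = agree = 0
    for i in range(len(shared)):
        for j in range(i + 1, len(shared)):
            u, v = shared[i], shared[j]
            pairs += 1
            if (pos_a[u] < pos_a[v]) == (pos_b[u] < pos_b[v]):
                agree += 1
    return agree / pairs


def combined_similarity(s1, s2, weight=0.8):
    """Weighted mix of lexical overlap and word-order agreement."""
    a, b = content_words(s1), content_words(s2)
    return (weight * lexical_similarity(a, b)
            + (1 - weight) * order_similarity(a, b))
```

With these definitions, an active-passive rewrite such as "The committee approved the proposal" versus "The proposal was approved by the committee" keeps full lexical overlap while the order score drops, so the combined score stays high. A detector that relied on word order alone would miss the pair.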

2. What computational strategies can reduce the retrieval space and improve efficiency in detecting disguised plagiarism using citations and semantic similarity?

The theme addresses the challenge of detecting disguised plagiarism (e.g., paraphrases, translations, idea plagiarism) efficiently at large document scale, given the high computational cost of semantic similarity measures. It explores hybrid approaches that use citation-based heuristics to preliminarily narrow the candidate reference documents before applying more expensive semantic and character-based analyses. This integration strives to optimize detection accuracy while maintaining feasible runtime for real-world applications.

Key finding: This paper proposed a hybrid plagiarism detection method that employs citation-based heuristics as an initial filter to reduce the set of candidate documents for detailed comparison. By leveraging unique citation patterns to...
Key finding: Although primarily linguistic in focus, this method's reliance on sentence-to-sentence semantic and syntactic comparisons implies substantial computational overhead, thus illustrating the need for retrieval space reduction as...
Key finding: This research’s integration of semantic role labeling and syntactic analysis also implies computational intensity, underscoring the importance of combining efficient document filtering techniques such as citation-based...
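The two-stage pattern described in this theme, a cheap citation-based filter followed by a costlier text comparison, can be sketched as below. The document layout (dicts with `id`, `citations`, and `text` keys), the function names, and both thresholds are assumptions made for illustration. Character trigram overlap stands in for the more expensive semantic analysis, and the sketch does not reproduce any of the cited systems.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams after lowercasing and whitespace cleanup."""
    t = " ".join(text.lower().split())
    return {t[i:i + n] for i in range(len(t) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of character n-gram sets (the 'expensive' step here)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def citation_filter(suspicious, corpus, min_shared=2):
    """Keep only documents sharing at least `min_shared` cited sources."""
    return [doc for doc in corpus
            if len(suspicious["citations"] & doc["citations"]) >= min_shared]


def detect(suspicious, corpus, min_shared=2, threshold=0.5):
    """Run the text comparison only on citation-filtered candidates."""
    candidates = citation_filter(suspicious, corpus, min_shared)
    scored = [(doc["id"], ngram_similarity(suspicious["text"], doc["text"]))
              for doc in candidates]
    return [(doc_id, score) for doc_id, score in scored if score >= threshold]
```

The filter trades recall for speed: a source document that shares no citations with the suspicious document is never compared at all, so the choice of `min_shared` determines how much of the corpus the second stage sees.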

3. How effective are current AI-generated text detection tools, and how do adversarial modifications affect their performance in the context of academic integrity?

This research strand evaluates automated AI-content detection tools' ability to discriminate human-written from generative AI-generated text. It critically examines tools like Turnitin, ZeroGPT, GPTZero, and Writer AI against texts created by leading large language models (ChatGPT, Perplexity, Gemini), including scenarios where texts are paraphrased or edited adversarially. This work is crucial for academic integrity frameworks seeking reliable detection of AI-assisted writing and understanding weaknesses of detection technologies under intentional obfuscation.

Key finding: This study systematically benchmarked four AI-detection tools against texts generated by three different LLMs and subjected to three adversarial modification methods. Turnitin showed the highest accuracy and consistent...
Key finding: This review highlighted the growing research and application of machine learning and semantic analysis techniques to detect complex and obfuscated plagiarism, including emerging forms involving AI-generated texts. It...
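A benchmark of this kind comes down to tallying, for each detector and each modification condition, how often the predicted label matches the true one. The sketch below shows only that bookkeeping. The record layout, the function name, and the detector and condition names in the usage note are hypothetical, and the sketch contains no data from the studies summarized above.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Accuracy per (detector, condition) pair.

    `records` is an iterable of tuples
    (detector, condition, true_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for detector, condition, truth, predicted in records:
        key = (detector, condition)
        total[key] += 1
        correct[key] += int(truth == predicted)
    return {key: correct[key] / total[key] for key in total}
```

Calling `accuracy_by_group` on records tagged with conditions such as `"original"` and `"paraphrased"` makes any drop in accuracy under adversarial editing visible per tool, rather than hiding it in a single pooled score.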

All papers in Plagiarism Detection

Plagiarism is regarded as a heinous crime within the academic community, but anecdotal evidence suggests that some writers plagiarize without intending to transgress academic conventions. This article reports a study of the writing of 17...
A fundamental question in information theory and in computer science is how to measure similarity or the amount of shared information between two sequences. We have proposed a metric, based on Kolmogorov complexity to answer this...
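Kolmogorov complexity is not computable, so similarity measures built on it are usually approximated in practice with a real compressor. The truncated abstract above does not show the exact metric, so the sketch below is an assumption: it implements the widely used normalized compression distance (NCD), with `zlib` standing in for the ideal compressor.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a rough proxy for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their information;
    values near 1 suggest they share very little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Because real compressors add headers and have limited match windows, the distance from a string to itself is small but generally not exactly zero, and results on very short inputs are unreliable.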
Cross-language plagiarism detection deals with the automatic identification and extraction of plagiarism in a multilingual setting. In this setting, a suspicious document is given, and the task is to retrieve all sections from the...
The ways in which universities and individual academics attempt to deter and respond to student plagiarism may be based on untested assumptions about particular or primary reasons for this behaviour. Using a series of group interviews,...
This paper will address the question of the morality of technology. I believe this is an important question for our contemporary society in which technology, especially information technology, is increasingly becoming the default mode of...
In the context of an increasingly mobile student population, and Greek students specifically, this paper opens up and reveals the manner in which a specific culturally situated human actor (the Greek student) and a specific culturally...
Countless cases of plagiarism are detected across the Australian higher education sector each year. Generally speaking, policy and other responses to the issue focus on punitive, rather than on educative, measures. Recently, a subtle...
Motivation: Duplicate publication impacts the quality of the scientific corpus, has been difficult to detect, and studies thus far have been limited in scope and size. Using text similarity searches, we were able to identify signatures of...
... Full-text electronic research document corpuses have grown substantially over the past decade, and permit a ... to discover X and Y: they had copied sufficiently many pages, even from single sources, that they ... We thank Patrick Ng...
Existing literature provides insight into the nature and extent of plagiarism amongst undergraduate students (e.g., Parameswaran & Devi, 2006). Plagiarism amongst graduate students is relatively unstudied, however, and the existing data...
Laboratory work assignments are very important for computer science learning. Over the last 12 years many students have been involved in solving such assignments in the authors' department, having reached a figure of more than 400...
Plagiarism is increasingly evident in business and academia. While links between demographic, personality, and situational factors have been found, previous research has not used actual plagiarism behavior as a criterion variable....
Plagiarism detection can be divided into external and intrinsic methods. Naive external plagiarism analysis suffers from computationally demanding full nearest neighbor searches within a reference corpus. We present a conceptually simple...
Plagiarism Detection Systems have been developed to locate instances of plagiarism e.g. within scientific papers. Studies have shown that the existing approaches deliver reasonable results in identifying copy-and-paste plagiarism, but fail to...
Texts and their translations are a rich linguistic resource that can be used to train and test statistics-based Machine Translation systems and many other applications. In this paper, we present a working system that can identify...
Plagiarism of material from the Internet is a widespread and growing problem. Computer science students, and those in other science and engineering courses, can sometimes get away with a "cut and paste" approach to assembling a...
Empirical studies have revealed a disturbing prevalence of research misconduct in a wide variety of disciplines, although not, to date, in the areas of ethics and philosophy. This study aims to provide empirical evidence on perceptions of...
with Esra Eret in Procedia - Social and Behavioral Sciences, 2(2), 3303-3307
This paper overviews 15 plagiarism detectors that have been evaluated within the fourth international competition on plagiarism detection at PAN'12. We report on their performances for two sub-tasks of external plagiarism detection:...
Efficient detection of plagiarism in programming assignments of students is of a great importance to the educational procedure. This paper presents a clustering oriented approach for facing the problem of source code plagiarism. The...
In this work, we determined the level of incidence of the use of technologies on academic success and the incidence of interaction and experience on the level of plagiarism of university students. A sample of 10,952 students from 31...
Despite the importance of Supreme Court opinions for the American polity, scholars have dedicated little systematic research to investigating the factors that contribute to the content of the Court's opinions. In this paper, we examine...
Easy access to the Web has led to increased potential for students cheating on assignments by plagiarising others' work. By the same token, Web-based tools offer the potential for instructors to check submitted assignments for signs of...
Publication ethics principles became one of the main aspects of conducting scientific research and presenting its results. Publication ethics challenges cover a wide range of problems of varying importance that involve all participants of...
Back in 2004, Google Inc. (Menlo Park, CA, USA) began digitizing full texts of magazines, journals, and books dating back centuries. At present, over 25 million books have been scanned and anyone can use the service (currently called...
The purpose of this study was to evaluate the effectiveness of plagiarism detection software and penalty for plagiarizing in detecting and deterring plagiarism among medical students. The study was a continuation of previously published...
Easy access to the World Wide Web has raised concerns about copyright issues and plagiarism. It is easy to copy someone else's work and submit it as someone's own. This problem has been targeted by many systems, which use very similar...
The paper presents the implementation of a tool for plagiarism detection developed within the AXMEDIS project. The algorithm leverages the plagiarist behaviour, which is modeled as a combination of three basic actions: insertion, deletion,...
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting, style analysis etc. In this paper a new approach called Citation-based...
These are published by MARCET Staff Development Resource Centre on behalf of the University. Papers in this series will report research-based and evidence-based approaches to learning, teaching assessment and learning supported in HIGHER...
This paper argues that the inappropriate framing and implementation of plagiarism detection systems in UK universities can unwittingly construct international students as "plagiarists". It argues that these systems are often implemented...
Much of the current literature on plagiarism focuses on students, attempting to understand how students view the concept of plagiarism, the best ways to prevent it, and the impact of collaboration on the concept of original authorship. In...
The paper discusses self-plagiarism and associated practices in scholarly publishing. It approaches at some length the conceptual issues raised by the notion of self-plagiarism. It distinguishes among and then examines the main families...
Ethics have received increased attention from the media and academia in recent years. Most reports suggest that one form of unethical conduct – plagiarism – is on the rise in the business schools. Stereotypes of Asian students as being...
Detecting academic plagiarism is a pressing problem, e.g., for educational and research institutions, funding agencies, and academic publishers. Existing plagiarism detection systems reliably identify copied text, or near copies of text,...
Existing anti-plagiarism tools are, in fact, text matching systems but do not make accurate judgments about plagiarism. Texts that are acceptable to be redundant and texts that are cited properly are all highlighted as plagiarism, and the...
Detecting instances of plagiarism in student homework, especially programming homework, is an important issue for practitioners. In the past decades, several tools have emerged that are able to effectively compare large corpora of...
Plagiarism is one form of academic dishonesty, which is often done by students in programming classes. In a large class, detecting plagiarism manually is both difficult and time-consuming, especially due to the numerous modifications of...
Unauthorized re-use of code by students is a widespread problem in academic institutions, and raises liability issues for industry. Manual plagiarism detection is time-consuming, and current effective plagiarism detection approaches...
This paper describes a new approach towards detecting plagiarism and scientific documents that have been read but not cited. In contrast to existing approaches, which analyze documents’ words but ignore their citations, this approach is...
The ready availability of free online machine translation (MT) systems has given rise to a problem in the world of language teaching in that students, especially weaker ones, use free online MT to do their translation homework. Apart from...
Most Australian universities have made attempts of various kinds to address plagiarism. Some have responded in recent times with a primary focus on catching and punishing plagiarists, often assisted by computer software packages. Others...
Teaching computer engineering calls for an important practical component, usually covered by setting several laboratory exercises for each course. These exercises are specified as assignments by the teachers and have to be completed by...