Key research themes
1. How can linguistic and semantic features improve the accuracy of external plagiarism detection in text documents?
This research area focuses on improving plagiarism detection accuracy by integrating linguistic knowledge, including semantic relations and syntactic structures, so that detectors capture the meaning of a text rather than only its surface similarity. It addresses paraphrasing, synonym substitution, sentence restructuring, and active-passive voice changes, all of which traditional text-matching approaches often miss. These methods matter because sophisticated plagiarism disguises copied ideas through language manipulation; detecting it makes plagiarism detection tools more reliable.
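As a minimal sketch of why semantic normalization helps, the snippet below compares plain word overlap with overlap computed after mapping synonyms to a shared canonical form. The `SYNONYM_GROUPS` table and all function names are hypothetical and hand-written for illustration; a real system would presumably draw on a lexical resource such as WordNet and add syntactic analysis.

```python
import re

# Hypothetical, hand-written synonym groups (an assumption for illustration).
SYNONYM_GROUPS = [
    {"big", "large", "huge"},
    {"car", "automobile", "vehicle"},
    {"buy", "purchase", "bought", "purchased"},
]

# Map every word to one canonical representative of its group.
CANONICAL = {
    word: sorted(group)[0] for group in SYNONYM_GROUPS for word in group
}


def tokens(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def jaccard(a, b):
    """Jaccard overlap of two token collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def surface_similarity(s1, s2):
    """Overlap of the raw word forms only."""
    return jaccard(tokens(s1), tokens(s2))


def semantic_similarity(s1, s2):
    """Overlap after replacing each word with its canonical synonym."""
    def normalize(words):
        return [CANONICAL.get(w, w) for w in words]

    return jaccard(normalize(tokens(s1)), normalize(tokens(s2)))


source = "She bought a large car"
suspect = "She purchased a huge automobile"
print(surface_similarity(source, suspect))
print(semantic_similarity(source, suspect))
```

On a synonym-substituted sentence pair like the one above, the surface score stays low while the normalized score rises, which is the gap that linguistically informed detectors aim to close.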
2. What computational strategies can reduce the retrieval space and improve efficiency in detecting disguised plagiarism using citations and semantic similarity?
This theme addresses the challenge of detecting disguised plagiarism (e.g., paraphrases, translations, idea plagiarism) efficiently across large document collections, given the high computational cost of semantic similarity measures. It explores hybrid approaches that first use citation-based heuristics to narrow the set of candidate source documents and only then apply the more expensive semantic and character-based analyses. The aim is to preserve detection accuracy while keeping runtime feasible for real-world applications.
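The two-stage idea can be sketched roughly as follows: a cheap citation-overlap filter selects candidates, and a costlier comparison runs only on those. The document layout (`"citations"` and `"text"` keys), the thresholds `min_shared` and `top_k`, and the use of character trigram overlap as a stand-in for the expensive analysis are all assumptions made for this illustration, not a description of any specific published system.

```python
def shared_citations(cites_a, cites_b):
    """Count references that both documents cite."""
    return len(set(cites_a) & set(cites_b))


def char_ngrams(text, n=3):
    """Set of character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(text_a, text_b, n=3):
    """Jaccard overlap of character n-grams (stand-in for a costly measure)."""
    grams_a, grams_b = char_ngrams(text_a, n), char_ngrams(text_b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def detect(suspect, corpus, min_shared=2, top_k=10):
    """Return (doc_id, similarity) pairs for citation-filtered candidates."""
    # Stage 1: cheap filter that keeps documents sharing enough citations.
    scored = [
        (shared_citations(suspect["citations"], doc["citations"]), doc_id)
        for doc_id, doc in corpus.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    candidates = [doc_id for count, doc_id in scored if count >= min_shared]
    candidates = candidates[:top_k]

    # Stage 2: expensive comparison, run only on the reduced candidate set.
    results = [
        (doc_id, ngram_similarity(suspect["text"], corpus[doc_id]["text"]))
        for doc_id in candidates
    ]
    return sorted(results, key=lambda item: item[1], reverse=True)
```

The design choice is that stage 2 cost scales with the number of candidates rather than the corpus size; the trade-off is that a source sharing fewer than `min_shared` citations with the suspect document is never examined.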
3. How effective are current AI-generated text detection tools, and how do adversarial modifications affect their performance in the context of academic integrity?
This research strand evaluates how well automated AI-content detection tools distinguish human-written text from text produced by generative AI. It critically examines tools like Turnitin, ZeroGPT, GPTZero, and Writer AI against texts created by leading large language models (ChatGPT, Perplexity, Gemini), including scenarios where the texts are paraphrased or edited adversarially. This work is crucial for academic integrity frameworks that need reliable detection of AI-assisted writing and a clear understanding of how detection technologies fail under intentional obfuscation.
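One way such an evaluation might be scored is sketched below: the same labeled texts are run through a detector before and after paraphrasing, and the error rates are compared. The function name and the sample values are hypothetical placeholders invented for illustration; they are not measurements of any tool named above.

```python
def detection_metrics(labels, predictions):
    """Score a detector; True means 'AI-generated' in both lists."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for is_ai, flagged in pairs if is_ai and flagged)
    tn = sum(1 for is_ai, flagged in pairs if not is_ai and not flagged)
    fp = sum(1 for is_ai, flagged in pairs if not is_ai and flagged)
    fn = sum(1 for is_ai, flagged in pairs if is_ai and not flagged)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        # Human-written texts wrongly flagged as AI-generated.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        # AI-generated texts that slipped through undetected.
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }


# Made-up placeholder values, NOT results for any real detection tool.
labels = [True, True, True, True, False, False]
flags_original = [True, True, True, False, False, True]
flags_paraphrased = [True, False, False, False, False, True]

print(detection_metrics(labels, flags_original))
print(detection_metrics(labels, flags_paraphrased))
```

A rise in the false negative rate between the two runs would indicate that paraphrasing helps AI-generated text evade the detector, while the false positive rate matters separately because it reflects human authors being wrongly flagged.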