Spam Detection Research Papers

Biomedical Article Classification Using an Agent-Based Model of T-Cell Cross-Regulation

2025, Lecture Notes in Computer Science

We propose a novel bio-inspired solution for biomedical article classification. Our method draws from an existing model of T-cell cross-regulation in the vertebrate immune system (IS), which is a complex adaptive system of millions of... more

The Best Kept Secrets with Corpus Linguistics

by Neil Cooke

2025

This paper presents the use of corpus linguistics techniques on supposedly "clean" corpora and identifies potential pitfalls. Our work relates to the task of filtering sensitive content, in which data security is strategically important... more

Intellectual property escaped with the email? Press F1 for help

by Neil Cooke

2025

In this paper we describe an approach to information assurance in which we can prevent breach of confidentiality. Specifically, we examine aspects of the propagation of confidential information via email. Email provides one simple... more

SPAM DETECTION BASED ON FUSION OF SPAMMER BEHAVIOR FEATURES AND LINGUISTIC FEATURES AMNA IQBAL

by Murad Khan and

2025, tianjindaxuexuebao

E-commerce sites, forums, and blogs have become popular platforms for people to share their views. Reviews have emerged as a crucial source of information for potential customers, influencing their purchasing decisions. Similarly for... more

Analysis and Challenges in Detecting the Fake Reviews of Products using Naïve Bayes and Random Forest Techniques

by Tanveer Sajid

2025

In today’s world, fake review identification and prediction is an important area of sentiment analysis of the E-commerce industry. The automatic fake review categorizers identify and categorize a variety of duplicate, spam, fake and... more

Artificial and Natural Topic Detection in Online Social Networks

by GABRIEL FILIPE RODRIGUES TAVARES

2025, iSys - Brazilian Journal of Information Systems

Online Social Networks (OSNs), such as Twitter, offer attractive means of social interactions and communications, but also raise privacy and security issues. The OSNs provide valuable information to marketing and competitiveness based on... more

An Improved Framework for Content-based Spamdexing Detection

by Dr Asim Shahzad

2025

To modern Search Engines (SEs), one of the biggest threats to be considered is spamdexing. Nowadays spammers are using a wide range of techniques for content generation; they are using content spam to fill the Search Engine Result... more

Cutting-Edge Techniques for Detecting Fake Reviews

by Yogesh U Bodhe

2025, EAI Endorsed Transactions on AI and Robotics

The paper reviews various approaches for detecting fake reviews using different machine learning techniques, each with distinct strengths and limitations. It examines existing literature on supervised learning methods, unsupervised... more

SMS Spam Detection Using Machine Learning: An Experimental Study

by WARSE The World Academy of Research in Science and Engineering

2025, International Journal of Emerging Trends in Engineering Research

The exponential growth of mobile communication has intensified the threat of SMS spam, compromising user security and trust in messaging platforms. This study addresses this challenge by designing and deploying a robust spam detection... more

Character Recognition using Support Vector Classifier (SVC)

by Sakshi Panwar

2025

In this paper, the clustering Algorithm known as Support Vector Classifier (SVC) is used. SVC offers classifiers such as logistic regression and decision trees that provide very high accuracy compared to others. The model first... more

Advances in NLP Techniques for Detection of Message-Based Threats in Digital Platforms: A Systematic Review

by José Saias

2025, Electronics

Users of all ages face risks on social media and messaging platforms. When encountering suspicious messages, legitimate concerns arise about a sender's malicious intent. This study examines recent advances in Natural Language Processing... more

E-Mail Spam Detection using Machine Learning and Deep Learning

by Shivam Pandey

2025, International Journal for Research in Applied Science and Engineering Technology

Social Media, Definition and History

by Andreas Kaplan

2025, Springer eBooks

Building up a Verified Page on Facebook Using Information Transparency Guidelines

by Claudia Cappelli

2025, Social Computing and Social Media. Applications and Analytics

Online credibility is a quality pursued by users, businesses and brands on the Internet. Having a verified page on Facebook means improvement of the social web presence, reliability and reinforcement of security against impersonations of... more

Artificial intelligence for automatic moderation of textual content in online chats and social networks

by International Journal of Electrical and Computer Engineering (IJECE)

2025, International Journal of Electrical and Computer Engineering (IJECE)

The article explores fundamental techniques for converting text into numerical data for machine learning algorithms. It meticulously examines various methods, including word vector representation via neural networks like Word2Vec, and... more

The successful low-cost deployment of a secure Email gateway

by Christophe Gaboret

2025, it-sudparis.eu

Communicating by email has become crucial for all companies today. However, a significant amount of undesirable messages pass through mail servers. Such unwanted and unsolicited communications are responsible for a significant loss of... more

Web Spam Detection Using Different Features

by Sumit Sahu

2025

EXPLORING ARTIFICIAL INTELLIGENCE: FOUNDATIONS, DATA ANALYSIS, AND MACHINE LEARNING IMPLEMENTATION USING PYTHON SESSION -2023-25 SUBMITTED IN SRISIIM-BHARATI

by Neelam Gupta

2025

This research project acts as a foundational guide for students and early-career professionals interested in AI. It combines academic theory, technical practice, and ethical reflection to provide a holistic understanding of AI and machine... more

North, South, Least, Best: Geographical Location and the Thinking Styles of Italian University Students

by Cinzia COLAPINTO

2025, Australian Journal of Adult Learning

Abstract: There are economic and socio-cultural differences that characterise the north and south of Italy. A stereotype is that university students from rural southern Italy are more disadvantaged and isolated than those from the urban... more

SMS Spam Detection and Classification to Combat Abuse in Telephone Networks Using Natural Language Processing

by Oyeyemi Dare Azeez and

2025, Journal of Advances in Mathematics and Computer Science

In the modern era, mobile phones have become ubiquitous, and Short Message Service (SMS) has grown to become a multi-million-dollar service due to the widespread adoption of mobile devices and the millions of people who use SMS daily.... more

Smart Spam Detection of SMS Using Ensemble Methods of Machine Learning

by Kesavan K7

2025

The rapid proliferation of mobile communication has kept Short Message Service (SMS) a popular channel for personal and business communications. As the usage rose, the malicious exploitation followed, leading to the increased flow of... more

A Survey on Extraction Approach for Spam Filtering

by Rajesh Nigam

2025

With the growth of networking, the usage of email has also increased. Due to the rapid growth of the internet, communication depends mostly on electronic mail for both commercial and business purposes. According to today's... more

Email Spam Detector Research Paper

by Ajinkya Pratap Singh

2025

The widespread use of email as a primary communication medium has led to an increase in spam messages, which pose significant threats to privacy, productivity, and cybersecurity. Spam emails, often disguised as legitimate messages, can... more

PhishGuard: A Machine Learning Framework for Windows-Specific Phishing Detection

by IJRASET Publication

2025, International Journal for Research in Applied Science & Engineering Technology (IJRASET)

Phishing remains one of the most prevalent and evolving cybersecurity threats, exploiting human vulnerabilities through deceptive digital communication. This study proposes a dynamic, Windows-specific phishing detection model leveraging... more

Optimizing Spam Email Detection Accuracy Using Advanced Machine Learning Techniques

by Manish Sharma

2025, IJSDR

The number of email users is increasing every day worldwide. If you have to communicate officially with someone, whether in a business matter or with someone else, your electronic mail is the best option. When identifying the emails,... more

A Machine Learning approach for Fake Profile Classification in Social Networking

by editor ijircst

2025, International Journal of Innovative Research in Computer Science and Technology (IJIRCST)

This study suggests a machine learning-based detection system built with Python and Django to tackle the growing problem of fraudulent profiles on social networking sites. Malicious actors are progressively setting up phony identities for... more

Data Poisoning Attacks on EEG Signal-based Risk Assessment Systems

by Sani Umar

2025, arXiv (Cornell University)

Industrial insider risk assessment using electroencephalogram (EEG) signals has consistently attracted a lot of research attention. However, EEG signal-based risk assessment systems, which could evaluate the emotional states of humans,... more

Fake news Detection on Social Media Using Machine Learning

by TANZEELA ZARGER

2025, Journal Of Electrical Systems

The spread of misinformation on the internet and social media has become a growing problem, impacting public opinion and decision-making. In this study, we explore how machine learning can be used to detect and classify fake news more... more

Evaluating spam filters and Stylometric Detection of AI-generated phishing emails

by Paolo Modesti

2025, Expert Systems With Applications

The advanced architecture of Large Language Models (LLMs) has revolutionised natural language processing, enabling the creation of text that convincingly mimics legitimate human communication, including phishing emails. As AI-generated... more

Dispositivos y estrategias lingüístico-discursivas en narrativas de spams

by Miriam Jiménez Bernal

2025, Círculo de Lingüística Aplicada a la Comunicación

Our aim in this article is to relate emotions and gender stereotypes through an analysis of the narratives contained in spam. We will present four different types of spam from 450 emails, which... more

Discursive-Linguistic Devices and Strategies in Spam E-Mail Narratives

by Miriam Jiménez Bernal

2025, International Journal of Information Communication Technologies and Human Development

In this article, we aim at relating emotions and gender stereotypes through the analysis of the narratives contained in spam e-mails. We will present four different types of spam e-mails from a corpus consisting of 450 emails, their... more

A Machine Learning For Phishing Detection In Healthcare

by Twum Samuel

2025

Phishing, a social engineering attack, is becoming more common because of the quick digitization of healthcare services. Phishing is often used to steal user data, including login credentials and credit card numbers. It occurs when an... more

Triangle Estimation using Polylogarithmic Queries

by Arijit Bishnu

2025, ArXiv

Estimating the number of triangles in a graph is one of the most fundamental problems in sublinear algorithms. In this work, we provide the first approximate triangle counting algorithm using only polylogarithmic queries. Our query oracle... more

Tree-Based Supervised Learning Approach for Detection of Phishing Email Attacks

by Musa Yusuf

2025, Sule Lamido University Journal of Science & Technology

Phishing, a prevalent social engineering attack, is frequently employed by cybercriminals to steal sensitive user data, including login credentials, credit card information, and confidential government or corporate details. Despite... more

A Comparative Study of Machine Learning Classifiers for Different Language Spam SMS Detection: Performance Evaluation and Analysis

by Samrat Kumar Dev Sharma

2025, Advances in Artificial Intelligence Research

With the continuous rise in the number of mobile device users, SMS (Short Message Service) remains a prevalent communication tool accessible on both smartphones and basic phones. Consequently, SMS traffic has experienced a significant... more

Development of a Machine Learning Model for Image-based Email Spam Detection

by Christopher U . ONOVA

2025, FUOYE Journal of Engineering and Technology

Combatting email spam has remained a very daunting task. Despite the over 99% accuracy in most non-image-based spam email detection, studies on image-based spam hardly attain such a high level of accuracy as new email spamming techniques... more

Fast Ranking System for Proposed Page Rank Algorithm using Markov Chain

by Khaing Thanda

2025

Information retrieval on the web is a significant and complex activity for web mining. Because the number of websites on the Internet has increased enormously, the execution of the PageRank Algorithm should be easy and faster in... more

Query-log mining for detecting polysemy and spam

by CARLOS CASTILLO

2025, Proceedings of the KDD Workshop on Web Mining and Web Usage Analysis (WEBKDD)

Abstract. Through their interaction with search engines, users provide implicit feedback that can be used to extract useful knowledge and improve the quality of the search process. This feedback is encoded in the form of a query log that... more

Recent Trends & Innovations in Science, Engineering and Social Sciences

by BalaMurali Krishna K V

2025

Online Social Networks (OSNs) at present engage the majority of individuals, from the newborn to the elderly. Social networks provide direct connection between people and increased information sharing opportunities to provide a large... more

Semantic Analysis of Tags to Renovate Folksonomies in Social Tagging System

by Annie Dafney

2025

Grouping resources into a set of classes allows easy access to the resources we use in our day-to-day lives. This classification makes the search faster and easier. The process of classifying the resources manually becomes expensive. This... more

Spam detection by using machine learning based binary classifier

by Mohd Fadzil Abdul Kadir

2025, Indonesian Journal of Electrical Engineering and Computer Science

Because of its ease of use and speed compared to other communication applications, email is the most commonly used communication application worldwide. However, a major drawback is its inability to detect whether mail content is either... more

Leveraging Vectorization Techniques for Malicious Website Detection With Machine Learning

by Chitra Baskar

2025, Iraqi Journal of Science

Malicious websites are those that are created to harm visitors or exploit their information for illegal purposes. These websites are commonly utilized in attacks, such as phishing, malware distribution, and scams. Clicking on a malicious... more

Enhancing detection of zero-day phishing email attacks in the Indonesian language using deep learning algorithms

by beei iaes

2025, Bulletin of Electrical Engineering and Informatics

Email phishing is a manipulative technique aimed at compromising information security and user privacy. To overcome the limitations of traditional detection methods, such as blacklists, this research proposes a phishing detection model... more

The pipeline processing of NLP

by Зилола Юлдашевна Хусаинова

2025

The problem of NLP should be divided into several small parts and solved step by step. In this article, where NLP is necessary at every stage of solving the problem, all forms of text processing are considered. The step-by-step text... more

English version tokenization process of sentences

From the table above, it can be seen that suffixes such as -lar, -im, -ing, -imiz, -si, and -n are added directly to the stem, to the left of any other affix morpheme. The affixes -man, -san, -miz, and -siz cannot be added directly to the stem, because another morpheme (for example, a tense suffix) is added to the stem before them; this group of suffixes occupies the next position, to the right of the other affix morphemes. So, in the Uzbek language, the position of the stem and affixes is minimally as follows:

Figure 5 below presents the algorithm for determining the structure of words:

Fig. 5. Algorithm for determining word structure in Uzbek language

Fig. 6. Examples of the lemmatization process

Stemming and lemmatization should be distinguished. Their results in the Uzbek language often seem to be the same, but they are completely different processes: stemming removes the suffix from the stem, and lemmatization determines the dictionary form of the stem. Below are examples of the analysis of the stemming and lemmatization process:
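The distinction can be sketched in a few lines of Python. This is only a toy illustration, not the article's algorithm: the suffix list is a subset of the suffixes named in this listing, and the lemma dictionary and the example words are assumptions introduced here.

```python
# Hypothetical suffix list (longest first) and lemma dictionary,
# introduced for illustration only; they are not taken from the article.
SUFFIXES = ["imiz", "lar", "ing", "im", "si"]
LEMMAS = {"kitoblar": "kitob", "o'g'lim": "o'g'il"}


def stem(word):
    """Crude stemmer: strip the first matching suffix, if any."""
    for suffix in SUFFIXES:
        # Require at least two characters to remain after stripping.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Dictionary lookup: return the dictionary form if it is known."""
    return LEMMAS.get(word, word)


for w in ("kitoblar", "o'g'lim"):
    print(w, "stem:", stem(w), "lemma:", lemmatize(w))
```

For a word such as "kitoblar" the two functions agree, while for "o'g'lim" the stripped stem differs from the dictionary form, which is the case where the two processes have to be kept apart. A real stemmer would also need to handle chains of suffixes, which this sketch does not attempt.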

Fig. 8. Common primary processing steps for a text fragment

Not all preprocessing steps (removing non-essential words, numbers, and punctuation, and converting text to lowercase) are always necessary. For example, removing numbers and punctuation marks from a text may not matter much at first. However, uppercase letters are usually replaced with lowercase letters before stemming is performed on the text.
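One way to make these steps optional is to expose each of them as a switch, as in the minimal Python sketch below. The function name, its parameters, and the three-word stop list are assumptions for illustration; the article's own list of non-essential words (Table 1) is not reproduced in this listing.

```python
import re

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"va", "bilan", "uchun"}


def preprocess(text, lowercase=True, drop_numbers=True,
               drop_punctuation=True, drop_stop_words=True):
    """Apply the selected clean-up steps and return a list of tokens."""
    if lowercase:
        text = text.lower()
    if drop_numbers:
        text = re.sub(r"\d+", " ", text)
    if drop_punctuation:
        # Keep word characters, whitespace, and apostrophes (as in o', g').
        text = re.sub(r"[^\w\s']", " ", text)
    tokens = text.split()
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


print(preprocess("Kitob va 2 daftar, qalam!"))
print(preprocess("Kitob va 2 daftar, qalam!", drop_numbers=False))
```

Lowercasing runs first so that later steps, including the stop-word check and any subsequent stemming, see a single case.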

Fig. 9. NLP pipeline processing ER model

Fig. 12. Extended processing steps for a text fragment

Fig. 13. An example of POS tagging and sentence structure in the NLP pipeline processing

Table 1. Common non-essential words in Uzbek language

TABLE 2. The morphemes set on the left side

Fig. 10. Morphoanalyzer of the Uzbek language (http://uznatcorpara.uz/)

“Sumbula sapchib o'rnidan turdi-da: «Xayriyat, tushim ekan», — dedi hansirab”

Analysing the Twitter social graph: Whom can we trust?

by Evgeny Morozov

2025

Figure 2: Verified users are followed by more people than they follow. “Followings” are users whom verified users follow. “Followers” are users who follow verified users. The boxplot (outliers are not shown) shows that verified users tend to follow restrictively although they have a large number of followers. This is also shown by the median and maximum values of the number of followings and the number of followers of verified users. Therefore, verified users follow selectively, thereby unintentionally providing a checkpoint on the trust of other users in the Twitter graph.

The number of users of each degree (in the 2009 data set) is shown in Figure 4.

Figure 5: Average total followers per user by degree of trust (log scale)

Figure 6: Ratio R; of the number of verified followers to the total number of followers for all users by category: verified users, users of trust degree 1, experts, and other users

We then plotted the CDF of R2 for all users by category, shown in Figure 7. R2 upholds the notion of trust, as verified users have a higher value than experts. The problem with this ratio was that the value of R2 was always less than 1, and because R2 is a continuous metric it was difficult to distinguish the different trust levels. We would prefer a discrete metric.

Figure 8: Number of verified followers for unverified users. X-axis is in log scale.

Results: The trusted category of users

Now that we have seen that the trust score takes distinct values from 0 to 400, the next question is how to draw the line between who can be trusted and who can’t. We know that the higher the trust score, the more trusted the person. But we still cannot categorise users into the fourth category of users (trusted users) explained in section 2.1. Above what threshold value of the score do we classify users as “trusted”? This is a subjective question, and we try to answer it by first showing how the number of unverified trusted users varies with increasing threshold values (Figure 11) and then analysing the number of trusted users at some selected threshold levels. Because of the subjective nature of the question, there is no definite answer.

We have taken four sample threshold values to find the best threshold value for trust. In the table that follows, we see that there are more than 8.2 million trusted users with a threshold of 2, which means that this threshold is not very stringent, and therefore is not the best choice. We see that more than 600 thousand users are trusted at a threshold of 10, and this seems to be a good value because it makes 0.132% of the users trusted, as compared to the 0.006% verified users. This threshold level is neither too lenient nor too stringent. If we take the threshold of 53, we obtain 29 thousand trusted users, which is almost equal to the 28.8 thousand verified users.
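The threshold comparison can be illustrated with a short Python sketch. It assumes that a user counts as trusted when the trust score is greater than or equal to the threshold; the excerpt does not say whether the comparison is inclusive, and the scores below are made-up sample values, not data from the paper.

```python
def count_trusted(scores, threshold):
    """Number of users whose trust score reaches the threshold.

    Assumption: the comparison is inclusive (score >= threshold).
    """
    return sum(1 for s in scores if s >= threshold)


def trusted_share(scores, threshold):
    """Percentage of all users that are trusted at this threshold."""
    if not scores:
        return 0.0
    return 100.0 * count_trusted(scores, threshold) / len(scores)


# Hypothetical trust scores for eight unverified users.
sample_scores = [0, 0, 1, 2, 5, 10, 53, 400]

for t in (2, 10, 53):
    print(t, count_trusted(sample_scores, t), trusted_share(sample_scores, t))
```

Raising the threshold can only shrink the trusted set, so scanning a few candidate values in increasing order shows how quickly the population falls away.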

Figure 12: The number of users in each score value from 0 to 400 by user category. The verified users and the experts have a similar curve — long tail (which almost merges in the end) and slower decay than other users. This shows that, according to the trust score, experts are more like verified users than other users.

Figure 13: Although there are many more “other” users, at any threshold level above 6 there are more trusted experts than other users. This shows that the score upholds the property of experts being influential and having a large impact on Twitter, and therefore being more trusted. Please note that verified users are not shown in this plot.

Figure 14: Scatter plot of the trust score vs the total number of followers of all users. There is no correlation, which means that the trust score cannot be predicted using the number of followers of a user. The red line shows the maximum score that a user with a given number of followers can have.

The goal was to inspect the trust metrics (mentioned in the previous section) on the 2012 dataset, but because of its huge size we did not work on it until we were sure which method we wanted to apply. After we deduced that the trust score is the best metric for measuring trust, we applied it to the 2012 dataset. But the raw graph (adjacency list) was too large and required some data processing to make it efficiently accessible by our algorithms. The 2012 dataset has verified users, experts and other users like the 2009 dataset (all distinct

1.1 The new population

We have now seen how the threshold is used to divide the set of unverified users into trusted and not trusted users. But as shown in Figure 1, the interesting subset of users is the trusted set of users in the “other” category (defined in Section 2.1) of users. The trusted users in the “other” category are neither verified, nor are they experts, but are trusted by our method.

before. In this section we show how the size of this population changes with the threshold. The percentage of trusted users in the “other” category drops significantly as the threshold is increased. This shows that experts are more trusted, which is why they are also called influential. But at the threshold of 5, there are 1.1 million trusted users from the “other” category, which is a significant number considering that there are 2.8 million expert users and only 28,819 verified users. The number of trusted users from the “other” category is more than one-third of the size of the experts’ category. Thus we have achieved our goal of finding a new population of significant size of trusted users on Twitter.

Fame for sale: Efficient detection of fake Twitter followers

by Maurizio Tesconi

2025, Decision Support Systems

Fake followers are those Twitter accounts specifically created to inflate the number of followers of a target account. Fake followers are dangerous for the social platform and beyond, since they may alter concepts like popularity and... more

A Fake Follower Story: improving fake accounts detection on Twitter

by Maurizio Tesconi

2025

Fake followers are those Twitter accounts created to inflate the number of followers of a target account. Fake followers are dangerous to the social platform and beyond, since they may alter concepts like popularity and influence in the... more

Comparative Graph Theoretical Characterization of Networks of Spam

by Luis Bettencourt

2025, Conference on Email and Anti-Spam

Email is an increasingly important and ubiquitous means of communication, both facilitating contact between individuals and enabling rises in the productivity of organizations. However, the relentless rise of automatic unauthorized... more

Auto-Grouping Emails For Faster E-Discovery

by Sachindra Joshi

2025

In this paper, we examine the application of various grouping techniques to help improve the efficiency and reduce the costs involved in an electronic discovery process. Specifically, we create coherent groups of email documents which... more
