Email Spam Filter

105 papers

90 followers

About this topic

An email spam filter is a software application designed to identify and block unsolicited or unwanted email messages, commonly known as spam. It employs various algorithms and techniques, such as keyword analysis and machine learning, to assess the likelihood of an email being spam and to protect users' inboxes from irrelevant or harmful content.

Key research themes

1. How can machine learning algorithms and feature engineering optimally detect and classify email spam?

This research area focuses on leveraging diverse machine learning models, including traditional classifiers like Naïve Bayes, SVM, Random Forest, and ensemble techniques, to improve spam email detection accuracy. A strong emphasis is placed on feature extraction and data preprocessing methods such as TF-IDF vectorization, word embeddings, and keyword analysis to enhance the discriminatory power of models. This theme matters because accurate spam detection reduces resource wastage, protects user privacy, and mitigates financial and phishing risks associated with spam emails.
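As a rough illustration of the classical approach described above, the sketch below trains a multinomial Naïve Bayes classifier on bag-of-words counts using only the Python standard library. It is not taken from any of the papers listed here: the class name `NaiveBayesSpamFilter`, the `tokenize` helper, and the four training messages are invented for illustration, and TF-IDF weighting is left out to keep the example short.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocabulary = set()

    def train(self, text, label):
        """Add one labelled message ("spam" or "ham") to the model."""
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def log_score(self, text, label):
        """Return log P(label) + sum of log P(token | label)."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        vocab_size = len(self.vocabulary)
        for token in tokenize(text):
            count = self.word_counts[label][token]
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def predict(self, text):
        """Return the label with the higher log score."""
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


# Hypothetical toy corpus; a real study would use a dataset such as Enron.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

spam_filter = NaiveBayesSpamFilter()
for text, label in TRAINING:
    spam_filter.train(text, label)
```

Calling `spam_filter.predict(...)` on a new message returns either "spam" or "ham". With a realistic corpus, the same structure would typically be combined with TF-IDF features and evaluated on a held-out test set.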

Spam-Detection with Comparative Analysis and Spamming Words Extractions

by Chetna Kaushal

2024

Key finding: This study applied four machine learning and two deep learning models on combined datasets including TREC07 and Enron to classify spam emails and identify recurrent spam keywords. It found that advanced feature engineering... Read more

Email Spam Detection Using Machine Learning

by IRJET Journal

2023, IRJET

Key finding: Utilizing TF-IDF text representation combined with machine learning algorithms such as Support Vector Machines (SVM), Random Forest, and Naïve Bayes, this work demonstrated how numerical features derived from NLP techniques... Read more

Evaluation of Supervised Learning Models for Automatic Spam Email Detection

by Tsehay Assegie

2024, Research Square (Research Square)

Key finding: Through an empirical comparison of eight supervised models on a pre-processed and balanced email dataset, the study found Random Forest to consistently outperform others with an accuracy of 96.6%. The evaluation incorporated... Read more

Spam based Email Identification and Detection using Machine Learning Techniques

by Joyece Jane

2023

Key finding: This paper systematically applies and compares machine learning algorithms such as Naïve Bayes, SVM, and ensemble methods alongside bio-inspired algorithms on multiple datasets with extensive preprocessing. It confirms that... Read more

Optimizing Spam Email Detection Accuracy Using Advanced Machine Learning Techniques

by Manish Sharma

2025, IJSDR

Key finding: The study investigated the application of multiple machine learning algorithms including Random Forest, Logistic Regression, Naïve Bayes, and SVM across multiple datasets, incorporating feature engineering, bagging, and... Read more

2. What advances do deep learning and hybrid attention mechanisms offer for detecting spam in email data?

This theme investigates the application of deep learning architectures—especially models integrating convolutional neural networks (CNN), gated recurrent units (GRU), and attention mechanisms—for email spam filtering. These methods focus on hierarchical feature extraction and contextual weighting of informative text segments, aiming to overcome limitations of classical techniques and improve generalization across datasets. The novelty lies in capturing complex semantic structures and temporal dependencies in email content, a crucial advance given the linguistic complexity of spam emails.
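The "contextual weighting of informative text segments" mentioned above can be sketched, under simplifying assumptions, as attention pooling: each token vector is scored against a context vector, the scores are normalised with a softmax, and the weighted sum becomes the message representation. This is not the architecture of any paper listed here. The function names `softmax` and `attention_pool` and the tiny vectors are made up for illustration, and in a trained model both the token vectors and the context vector would be learned.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtracting the maximum improves numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, context_vector):
    """Weight each token vector by its dot-product similarity to the context."""
    scores = [
        sum(t * c for t, c in zip(vec, context_vector))
        for vec in token_vectors
    ]
    weights = softmax(scores)
    dim = len(context_vector)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return weights, pooled


# Hypothetical 2-dimensional token vectors for a three-token message.
TOKEN_VECTORS = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
CONTEXT_VECTOR = [1.0, 1.0]

weights, pooled = attention_pool(TOKEN_VECTORS, CONTEXT_VECTOR)
```

Here the second token has the largest dot product with the context vector, so it receives the largest weight and dominates the pooled representation. Hierarchical variants apply the same step first over words and then over sentences.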

Email Spam Detection Using Hierarchical Attention Hybrid Deep Learning Method

by Sultan Zavrak

2023

Key finding: This research proposed a hybrid model combining CNN, GRU, and hierarchical attention mechanisms, which selectively focused on relevant email text parts during training. The temporal convolution layers enabled flexible... Read more

A Novel Fuzzy-Logic-Based Multi-Criteria Metric for Performance Evaluation of Spam Email Detection Algorithms

by Rehan Akbar

2023, Applied Sciences

Key finding: The study applied machine learning techniques to identify recurrent word groups characteristic of spam and introduced a feedback-trained model with tokenizers and Naïve Bayes classifiers to distinguish between spam and ham... Read more

Context and Machine Learning Based Trust Management Framework for Internet of Vehicles

by Rehan Akbar

2023, Computers, Materials & Continua

Key finding: This research demonstrated the application of neural networks to email spam filtering, showing their capability to learn complex patterns and outperform standard classifiers in accuracy metrics, specifically for phishing and... Read more

3. How effective are email service providers' pre-acceptance spam filtering techniques and what are the limitations?

This research area delves into the strategies employed by major email providers like Gmail, Yahoo, and Outlook at the SMTP pre-acceptance stage to filter spam, including blacklists, whitelists, and sender reputation analysis. It quantifies the proportion of spam and legitimate emails filtered before message acceptance and analyses the challenges posed by sophisticated spam gangs and end-host spammers. Understanding these filtering boundaries is vital for optimizing server resources and enhancing spam mitigation strategies.
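A minimal sketch of a pre-acceptance decision, assuming the server knows only the connecting IP address at this stage, might combine the three signals named above as follows. The function name `pre_acceptance_decision`, the reputation scores, and the 0.5 threshold are hypothetical and do not describe any provider's actual policy; the addresses come from the IP ranges reserved for documentation.

```python
import ipaddress


def pre_acceptance_decision(sender_ip, whitelist, blacklist, reputation, threshold=0.5):
    """Return "accept" or "reject" before any message content is read.

    whitelist and blacklist are lists of ipaddress network objects;
    reputation maps an IP string to a score in [0, 1], higher being better.
    """
    ip = ipaddress.ip_address(sender_ip)
    # Whitelisted senders bypass every other check.
    if any(ip in network for network in whitelist):
        return "accept"
    # Blacklisted senders are refused at connection time.
    if any(ip in network for network in blacklist):
        return "reject"
    # Senders with a known poor reputation are refused as well.
    score = reputation.get(sender_ip)
    if score is not None and score < threshold:
        return "reject"
    # Unknown senders are accepted and left to later content-based filtering.
    return "accept"


# Hypothetical lists built from the reserved documentation address ranges.
WHITELIST = [ipaddress.ip_network("192.0.2.0/24")]
BLACKLIST = [ipaddress.ip_network("198.51.100.0/24")]
REPUTATION = {"203.0.113.7": 0.2, "203.0.113.8": 0.9}
```

Checking the whitelist first is a design choice made for this sketch: it guarantees that a trusted sender is never rejected because of a stale blacklist entry, at the cost of trusting the whitelist completely.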

On the Effectiveness of Pre-Acceptance Spam Filtering

by Zhuoqing Mao

2023

Key finding: Through a large-scale empirical study using millions of emails collected at UW-Madison, the authors found that pre-acceptance filtering methods, such as blacklists and whitelists constructed from sender-tracking heuristics,... Read more

Machine learning for email spam filtering: review, approaches and open research problems

by Emmanuel Gbenga Dada

2021, Heliyon

Key finding: This review synthesizes state-of-the-art machine learning implementations in major email providers’ spam filters, highlighting Google's advanced neural network-based filtering achieving ~99.9% accuracy. It details innovative... Read more

All papers in Email Spam Filter

Email Deliverability in Newsletter Mailing vs. Traditional Email Marketing

by Javier Garcia

2025, Email Deliverability

Email deliverability is a critical factor in digital communication, determining whether messages reach recipients’ primary inboxes or are relegated to spam and promotional folders. This paper examines deliverability in the context of both... more

Postmasters didn't panic? When email is the same as snail mail

by Terence Rajivan Edward

2025

When email became more widely used, in the late 1990s, I heard the term “snail mail” to refer to mail by post. Whereas anyone online could almost instantly send a little letter to someone else online, or a big one, if one uses traditional... more

Email Classification Using Behavior and Time Features

by Peter Norrington

2025, Journal of Internet Technology

The various forms and tremendous number of spam emails have brought great challenges to accurate email classification. In this paper, we present a behavior- and time-feature-based email classification method. Based on email logs, email... more

Phishing Image Spam Classification Research Trends: Survey and Open Issues

by Fatimah Khalid

2025, International Journal of Advanced Computer Science and Applications

A phishing email is an attack that focuses entirely on people in order to circumvent existing traditional security algorithms. The email appears to be a dependable, appropriate, and solid communication medium for internet users. At present, the... more

SMS Spam Detection Using Machine Learning: An Experimental Study

by WARSE The World Academy of Research in Science and Engineering

2025, International Journal of Emerging Trends in Engineering Research

The exponential growth of mobile communication has intensified the threat of SMS spam, compromising user security and trust in messaging platforms. This study addresses this challenge by designing and deploying a robust spam detection... more

Character Recognition using Support Vector Classifier (SVC)

by Sakshi Panwar

2025

In this paper, the clustering Algorithm known as Support Vector Classifier (SVC) is used. SVC offers classifiers such as logistic regression and decision trees that provide very high accuracy compared to others. The model first... more

Spam filtering using hybrid local-global Naive Bayes classifier

by Rohit Solanki

2025, 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Spam: A Big Data Challenge

by Dr. K. Poulose Jacob

2025, International Journal of Advanced Research in Computer Science

Spam comes in a variety of content types, such as text, images, embedded HTML, and MIME attachments, and the volume of spam mail sent per day is massive. To handle this high volume, high velocity, and large variety of spam, a scalable spam... more

E-Mail Spam Detection using Machine Learning and Deep Learning

by Shivam Pandey

2025, International Journal for Research in Applied Science and Engineering Technology

EXPLORING ARTIFICIAL INTELLIGENCE: FOUNDATIONS, DATA ANALYSIS, AND MACHINE LEARNING IMPLEMENTATION USING PYTHON SESSION -2023-25 SUBMITTED IN SRISIIM-BHARATI

by Neelam Gupta

2025

This research project acts as a foundational guide for students and early-career professionals interested in AI. It combines academic theory, technical practice, and ethical reflection to provide a holistic understanding of AI and machine... more

Spammer Detection and Fake User Identification on Social Networks

by Kulasekhar Ramanaboyina

2025, Arjun College of Technology

This paper presents a framework to detect spammers and fake users on social networking platforms using machine learning algorithms. With the rise in usage of platforms like Twitter and Facebook, malicious activities like spam posting and... more

Email Spam Detector Research Paper

by Ajinkya Pratap Singh

2025

The widespread use of email as a primary communication medium has led to an increase in spam messages, which pose significant threats to privacy, productivity, and cybersecurity. Spam emails, often disguised as legitimate messages, can... more

Optimizing Spam Email Detection Accuracy Using Advanced Machine Learning Techniques

by Manish Sharma

2025, IJSDR

The number of email users is increasing every day worldwide. For official communication, whether on a business matter or otherwise, email is the best option. When identifying the emails,... more

Evaluating spam filters and Stylometric Detection of AI-generated phishing emails

by Paolo Modesti

2025, Expert Systems With Applications

The advanced architecture of Large Language Models (LLMs) has revolutionised natural language processing, enabling the creation of text that convincingly mimics legitimate human communication, including phishing emails. As AI-generated... more

Trust, Security and Privacy in Global Computing

by Jean-Marc Seigneur

2025

During the past thirty years, the world of computing has evolved from large centralised computing centres to an increasingly distributed computing environment, where computation and communication capabilities are being embedded in artefacts of everyday life. Billions of computational entities will interact in systems with ever changing configurations determined by local and global context, for example, the location of the user. In such dynamic environments, users would be overwhelmed if involved in computing-related decisions every time the context changes. Due to the number of decisions required to sustain continuous service, most decisions will have to be made by the computing entities themselves. Moreover, due to the global scale of the environment and the potential risk of disconnected operations, the computing entities may have to make these decisions autonomously, without relying on a given fixed infrastructure. Knowledge, especially about the context of the interaction, is vital for the accuracy of these decisions. However, keeping information on a global scale is unfeasible for resource-constrained entities, so some degree of uncertainty must be assumed. This peer-to-peer type of interaction in an uncertain world where interactions are needed to go forward resembles what occurs in human social networks. The notion of trust has emerged in human society to allow humans to make decisions under such circumstances. It has been proposed that computing entities can make decisions based on a computational model of trust. The trust engine run by each entity distributes and gathers pieces of evidence, that is, knowledge about the interacting entities: direct observations, recommendations or reputation. Since the trust engines collaborate and malicious collaborating entities exist, security through collaboration must be considered. 
As the real world does not have a unique legitimate authority, computing entities are owned by multiple authorities and operated from multiple jurisdictions. As in real life, no administrator can be perpetually present to manage the interactions. The trust engine can adapt security in a peer-to-peer way. A crucial element for the use of trust is to know with whom the entities interact, which corresponds to authentication in traditional computer security. However, this element has been disregarded in computational trust: this is ill-fated given that virtual identities are the means for a number of attacks that are less possible in face-to-face settings. This thesis sets up a framework, called entification, which encompasses both computational trust and identity aspects, and whose goal is to be applicable to global computing. For this purpose, this thesis

Comparative Analysis of Fake Account Detection using Machine Learning Algorithms

by Padmavathi Anbarasan

2025, IEEE

With the rise of social media platforms and new applications, the rapid expansion of fake accounts has become an important concern, posing threats to security, privacy and trustworthiness. In response, this research explores the... more

AI to detect Phishing

by Dev Mandora

2025

The internet has made it easier than ever to connect and do business. But it's also given bad guys more ways to trick people. Phishing is a common online scam where criminals try to get your personal information, like passwords and bank... more

A Comparative Study of Machine Learning Classifiers for Different Language Spam SMS Detection: Performance Evaluation and Analysis

by Samrat Kumar Dev Sharma

2025, Advances in Artificial Intelligence Research

With the continuous rise in the number of mobile device users, SMS (Short Message Service) remains a prevalent communication tool accessible on both smartphones and basic phones. Consequently, SMS traffic has experienced a significant... more

Development of a Machine Learning Model for Image-based Email Spam Detection

by Christopher U. ONOVA

2025, FUOYE Journal of Engineering and Technology

Combatting email spam has remained a very daunting task. Despite the over 99% accuracy in most non-image-based spam email detection, studies on image-based spam hardly attain such a high level of accuracy as new email spamming techniques... more

CLARIT experiments in batch filtering: term selection and threshold optimization in IR and SVM Filters

by Victor Sheftel

2025, NIST SPECIAL …

Benchmarking Machine Learning Techniques for Phishing Detection and Secure URL Classification

by IJCSMC Journal and

2025, International Journal of Computer Science and Mobile Computing (IJCSMC)

Phishing is still one of the biggest threats in cybersecurity. It is the exploitation of users through the use of deceptive URLs. In this study, the outcomes of the Random Forest, Support Vector Machines, and Decision Tree models are... more

Email Spam Classification using Neighbor Probability based Naïve Bayes Algorithm

by Dr. D. Suresh Babu

2024

Email spam is a kind of electronic spam that has become one of the more difficult problems among all internet challenges. Spam emails are mostly sent for commercial purposes; some of them may contain malware links that lead to phishing... more

Lecture Notes in Networks and Systems

by Joyece Jane

2024

The series "Lecture Notes in Networks and Systems" publishes the latest developments in Networks and Systems-quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of... more

An Anti-Filtering System Using a Hybrid Machine Learning Algorithm Based on a Variant of PSO

by Dr. D. Suresh Babu

2024

The use of email has grown exponentially over the past decade, making it one of the most widely used forms of electronic communication. Recently, spam emails have become a major issue for email users. A spammer is someone who sends out... more

Machine learning for email spam filtering: review, approaches and open research problems

by Stephen Joseph

2024, Heliyon

The upsurge in the volume of unwanted emails called spam has created an intense need for the development of more dependable and robust antispam filters. Machine learning methods have recently been used to successfully detect and filter... more

Comparison of Algorithms on Machine Learning For Spam Email Classification

by Erni Seniwati

2024, IJISTECH (International Journal of Information System and Technology)

The rapid development of email use and the convenience it provides make email the most frequently used means of communication. Along with this development, many parties are abusing email as a means of advertising promotion,... more

Email Classification into Relevant Category Using Neural Networks

by Shruti Goyal

2024

In the real world, many online shopping websites and service providers have a single email ID to which customers can send their queries, concerns, etc. At the back end, the service provider receives millions of emails every week; how can they identify... more

A New Feature Selection in Email Spam Detection by Particle Swarm Optimization and Fruit Fly Optimization Algorithms

by Journal of Computer and Knowledge Engineering

2024, Journal of Computer and Knowledge Engineering

With the advent of the internet, along with email and social networking, new issues have emerged that leave users vulnerable to attackers. Internet users face a lot of undesirable emails and their data privacy and... more

Using Supervised Machine Learning Classifiers to Identify Fake Reviews in Social Media and E-Commerce Websites

by MAYSARA M B ALSAAD

2024, 4TH The International Conference on Artificial Intelligence, 5G Communications, and Network Technologies (ICA5NT-2024), Velammal Institute of Technology, Department of Electronics and Communication Engineering & Department of Information Technology, Tamil Nadu, India.

Social media and e-commerce platforms have led online communities to utilize reviews as a means of exchanging opinions about products, services, and issues. Reviews can also help customers make better purchasing decisions by analyzing... more

Use of Supervised Machine Learning Classifiers for Online Fake Review Detection

by MAYSARA M B ALSAAD

2024, Journal of Applied Optics, Publisher: ying yong guang xue bian ji bu, ISSN: 1002-2082

Social media and e-commerce sites have prompted online communities to use reviews to provide input on goods, products, and services, as well as to support people to analyze customer input for buying choices, and corporations to improve... more

... of the dataset is used for training data.

Figure 1. Structure of suggested spam conducted.

Table 5. Support Vector Machine Classifier

Table 6. Various machine learning classifier

Table 7 gives the details of the vocabulary represented in n-words extracted from the test set.

The SVM classifier approach for fake review detection yielded encouraging...

PCSVD: A hybrid feature extraction technique based on principal component analysis and singular value decomposition

by Neeraj Raheja

2024, Journal of Autonomous Intelligence

Feature extraction plays an important role in accurate preprocessing and real-world applications. High-dimensional features in the data have a significant impact on the machine learning classification system. Relevant feature extraction... more

E-Mail Spam Detection using Machine Learning and Deep Learning

by Shivam Pandey

2024, International Journal for Research in Applied Science and Engineering Technology

Here we present an inclusive review of recent and successful content-based e-mail spam filtering techniques. Our focus is primarily on machine learning-based spam filters and variants that are inspired by them. We report on related ideas,... more

An Efficient Approach: Deep Learning Based Model for Adaptive SPAM Detection in IoT Networks

by International Journal of Scientific Research in Computer Science, Engineering and Information Technology IJSRCSEIT

2024, International Journal of Scientific Research in Computer Science, Engineering and Information Technology

Spamming is the use of messaging systems to send multiple unsolicited messages (spam) to large numbers of recipients for the purpose of commercial advertising, for the purpose of non-commercial proselytizing, for any prohibited... more

Figure 2 presents the training of the IoT devices. It covers the months from January to the end of December, counted in month format. The month consider...

Figure 4 shows the actual and average values for a moving-average window size of 12.

Figure 7 shows the predicted and actual data from the given dataset. It is clear from the result graph; the...

Figure 9 shows the energy prediction. The original and predicted energy are shown in graphical form.

Figure 5.26: Accuracy result graph. The accuracy achieved by the proposed work is 94.25%, compared with 92.8% achieved previously. The error rate of the proposed technique is 4.37%, compared with 8.2% in existing work. The simulation results therefore show that the proposed work achieves significantly better results than existing work.

A Hybrid Spam Filtering Technique Using Bayesian Spam Filters and Artificial Immunity Spam Filters

by Smera Rockey

2024

Spam emails are one of the crucial problems faced by most of the email users. There are a lot of algorithms to filter spam mails from ham mails. In this paper two efficient filters-Bayesian filters and Artificial Immunity filters are... more

Weighted k-Nearest Neighbour for Image Spam Classification

by Ban N. Dhannoon

2024, Iraqi Journal of Science

E-mail is an efficient and reliable data exchange service. Spam messages are undesired emails that are sent randomly in bulk, usually for commercial aims. Obfuscated image spamming is one of the new tricks to bypass text-based and Optical... more

Clustered negative selection algorithm and fruit fly optimization for email spam detection

by Chikh Ramdane

2024, Journal of Ambient Intelligence and Humanized Computing

At present, spam is an actual and increasing problem that compromises email communications across the world. Thus, several solutions have been proposed to stop or reduce the amount of this threat. However, methods based on negative... more

MMPC-RF: A Deep Multimodal Feature-Level Fusion Architecture for Hybrid Spam E-mail Detection

by Ali Yahyaouy

2024, Applied Sciences

Hybrid spam is an undesirable e-mail (electronic mail) that contains both image and text parts. It is more harmful and complex than image-based or text-based spam e-mail. Thus, an efficient and intelligent approach is required... more

Machine-Learning-Based Spam Mail Detector

by Sandeep Sengar

2024, SN Computer Science

The proliferation of spam emails, a predominant form of online harassment, has elevated the significance of email in daily life. As a consequence, a substantial portion of individuals remain vulnerable to fraudulent activities. Despite... more

Fig. 2 Quick overview of our dataset (both ham and spam mail)

Fig. 3 Relation between words in spam and ham mail, in red and yellow respectively

Fig. 4 Word cloud of the most frequent words in spam mail

Table 1 Comparison between Logistic and Random Forest

Table 2 Accuracy of the algorithm

... Bayes are 0.959381 and ... respectively, which is better than all other algorithms, as shown in the table below, which compares each algorithm we tested with our refined dataset. After making numerous changes to our Naïve Bayes-based algorithm, we were able to achieve an accuracy of 97.1954%, and we are still aiming for 100% accuracy. Table 13 shows a comparative study with other algorithms.

Table 3 Accuracy results of 11 different types of algorithms

Table 4 Results based on accuracy and precision

Table 5 Description of the combined data

Table 6 Description of spam data

Table 7 Description of ham data

Table 10 Relationship between char, word, and sentence in spam data

Table 11 Relationship between char, word, and sentence in ham data

[IJCST-V12I1P3]:Ipsita Panda, Sidharth Dash

by IJCST Eighth Sense Research Group

2024

With the rapid increase in internet users, e-mail spam is also increasing, which has become a major problem. Nowadays, emails have two subcategories: spam and ham. In addition to harming the system, malicious link senders via spam... more

Comparison of Three Machine Learning Models for the Detection of Emails Spam

by Raed Alkaied

2024, Research Square (Research Square)

Recently, machine learning has been applied in different major areas such as text classification, machine translation, and spam detection. The great performance of machine learning algorithms in several fields provided the humans with... more

Comparison of Three Machine Learning Models for the Detection of Emails Spam

by Raed Alkaied

2024

Recently, machine learning has been applied in different major areas such as text classification, machine translation, and spam detection. The great performance of machine learning algorithms in several fields provided the humans with... more

Sorting the Digital Stream: Big Data-driven Insights into Email Classification for Spam and Ham Detection

by Joyece Jane

2024

In contemporary email communication, the ever-expanding volume of digital correspondence has ushered in an era where big data plays a pivotal role in addressing the challenge of distinguishing between legitimate (ham) and unsolicited... more

Fig. 1: Proposed framework: Architecture and workflow for machine learning-based email ham and spam detection.

Fig. 2: With feature selection ANN on Spambase dataset. As depicted in Figure 2, two filter feature selection methods, namely Pearson correlation and chi-squared, were applied for the neural network classification. Surprisingly, the performance metric score was found to be poor, with the accuracy dropping by 72% compared to the accuracy achieved without feature selection. This outcome leads us to conclude that for neural network algorithms, the feature selection technique utilized did not improve performance, but instead resulted in a significant decrease in accuracy.

Fig. 3: Without feature selection ANN on Spambase dataset.

Fig. 4: With feature selection DT on Spambase dataset.

Fig. 5: Without feature selection DT on Spambase dataset

Fig. 6: Comparing classification metrics across varied algorithms.

TABLE III: Comparative analysis with other studies. Particularly striking is the discernible impact on the different datasets, complemented by preprocessing and feature selection, achieving remarkable accuracy scores that owe their success to the strategic implementation of feature selection and data preprocessing techniques in the proposed framework.

Spam Mails Filtering Using Different Classifiers with Feature Selection and Reduction Technique

by Renuka Yadav

2024, 2015 Fifth International Conference on Communication Systems and Network Technologies

The continuous growth of email users has resulted in an increase in unsolicited emails, also known as spam. Currently, server-side and client-side anti-spam filters are used for detecting different features of spam emails.... more

Take Control of Your SMSes: Designing an Usable Spam SMS Filtering System

by Kuldeep Yadav

2023, 2012 IEEE 13th International Conference on Mobile Data Management

Short Message Service (SMS) is one of the most frequently used services on mobile phones, next to calls. In developing countries like India, SMS is the cheapest mode of communication. The advantage of this fact is exploited by the... more

Genetic Programming With CNN Optimization For Financial Fraud Detection

by Kesava Rao Alla

2023, Journal of Harbin Engineering University

Financial fraud detection poses a critical challenge in the contemporary digital economy due to its potential to inflict substantial harm on individuals, businesses, and financial institutions. In this research, we introduce an innovative... more

Take Control of Your SMSes: Designing an Usable Spam SMS Filtering System

by Dipesh Singh

2023, 2012 IEEE 13th International Conference on Mobile Data Management

Machine intelligence-based algorithms for spam filtering on document labeling

by Ayush Goyal

2023, Soft Computing

The internet has provided numerous modes for secure data transmission from one end station to another, and email is one of those. The reason behind its popular usage is its cost-effectiveness and facility for fast communication. In the... more

E Enron dataset, LS Ling-Spam dataset

To judge the performance of the algorithm, two important questions arise: (1) Are the features well defined? (2) Is the amount of data available during the training phase sufficient? One way to answer these questions is to check the size of the dataset at training time. With the increase in the size of training data, it becomes very complex for a

Fig. 4 Comparative study of different machine learning algorithms. E Enron dataset, LS Ling-Spam dataset, Info Gain information gain, SBS sequential backward selection, EFS exhaustive feature selection, SFS sequential forward selection, BFS best first search, WFS wrapper-based feature selection

RF is the best of the classifiers, both with and without the feature selection strategies. The outcomes show that, among the feature selection strategies used (such as the sequential feature selector, GridSearchCV and the exhaustive feature selector), best first search performed well in comparison with the others, aside from NB on the Enron dataset.
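The legend above lists sequential forward selection (SFS) among the compared strategies. As a rough illustration of the general idea only (not the cited paper's implementation), the sketch below greedily adds the feature that most improves a score. The function names, the `toy_score` function, and its gain values are hypothetical; in practice the score would typically be a cross-validated classifier metric.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    k: int,
) -> List[int]:
    """Greedily build a subset of k feature indices.

    At each step, add the remaining feature whose inclusion gives the
    highest value of `score` for the enlarged subset.
    """
    selected: List[int] = []
    remaining = set(range(n_features))
    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


def toy_score(subset: Sequence[int]) -> float:
    """Hypothetical score: per-feature gains with a redundancy penalty."""
    gain = {0: 0.30, 1: 0.25, 2: 0.05, 3: 0.20}  # made-up values
    total = sum(gain[f] for f in subset)
    if 0 in subset and 1 in subset:
        total -= 0.20  # features 0 and 1 are assumed to be redundant
    return total


if __name__ == "__main__":
    print(sequential_forward_selection(4, toy_score, k=2))
```

With these made-up gains, feature 1 looks strong on its own but is skipped in the second step because it is redundant with feature 0, which is the kind of interaction a wrapper-style search can pick up and a per-feature ranking cannot.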

Fig. 5 Four cross-fold validation techniques. A—trained features, B—test features, S()—selection() and TS—training set

Email system serves as one of the most powerful communication systems for transmitting the user’s information from one to another. It includes not only text but also images, files, etc. This platform helps the user in saving a huge amount of time and money in comparison with outdated techniques like telegrams, etc. Nowadays, approximately 281 billion emails are transmitted all over the world in our day-to-day life (https://cacm.acm.org/magazines/

Table 2 Feature selection schemes

the total number of mails, w is the word, P(H) is the probability for ham feature and P(S) is the probability for spam feature. This is clearly shown in Eq. (5) and Eq. (6), respectively.



SMS Spam Detection and Classification to Combat Abuse in Telephone Networks Using Natural Language Processing

by Dare Oyeyemi and

2023, Journal of Advances in Mathematics and Computer Science

In the modern era, mobile phones have become ubiquitous, and Short Message Service (SMS) has grown to become a multi-million-dollar service due to the widespread adoption of mobile devices and the millions of people who use SMS daily.... more

Spam detection has traditionally relied on keyword filters to differentiate between spam and legitimate messages for the past two decades [4]. More recently, advanced methods like Statistical Learning Theory, Artificial Neural Networks (ANNs), and Support Vector Machines (SVM) have emerged. However, according to [5], these newer techniques exhibit inconsistent performance across different training datasets without logical or apparent explanation. There are numerous spam filtering techniques; however, because each has strengths and drawbacks, no single spam filtering strategy can be guaranteed to be 100% effective at eradicating spam. The application of text mining techniques to SMS will improve the effectiveness of detecting and classifying spam messages [6], which will reduce telephone network abuse. There are now a staggering number of different forms of SMS, and a different technique that can effectively classify SMS with low latency must be proposed.

Fig. 1. Category and Percentage (%) of Top SMS Spammers in Nigeria (www.technext24.com/2019/12/04/truecaller-report)

Fig. 2. Truecaller Average spam SMS per user/month in Africa (https://pmnewsnigeria.com/1029/03_02)

The research design and approach for this study used a cross-sectional research method that involves collecting and analyzing data at a specific time. The research design is structured into: data collection, data preprocessing, feature extraction, and classification. The study aims to develop a model for SMS classification and detection that can be used to combat abuse in a telephone network.

The BERT model preprocessor and encoder aided the conversion of the dataset into a numerical format used for the various downstream natural language processing tasks in this study. The preprocessor prepared the input data for the BERT model by performing tokenization, adding special tokens such as [CLS] and [SEP], segmenting the input data into separate sequences, masking tokens, and padding the input sequence to a fixed length. Figs. 4 and 5 show how the pre-trained BERT model processes text data.
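The preprocessing steps listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the study's pipeline: a whitespace split stands in for BERT's WordPiece tokenizer, the vocabulary is a made-up toy dictionary, the function name `bert_style_preprocess` is hypothetical, and "masking" is read here as building an input mask that separates real tokens from padding.

```python
from typing import Dict, List

# Toy vocabulary (hypothetical); a real BERT vocabulary has ~30k WordPiece entries.
TOY_VOCAB: Dict[str, int] = {
    "[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
    "win": 4, "a": 5, "free": 6, "prize": 7,
}


def bert_style_preprocess(
    text: str, vocab: Dict[str, int], max_len: int = 8
) -> Dict[str, List[int]]:
    """Convert one message into fixed-length id, mask, and segment lists."""
    # 1. Tokenize (whitespace split as a stand-in) and truncate so that
    #    the two special tokens still fit within max_len.
    words = text.lower().split()[: max_len - 2]
    # 2. Add the special tokens around the sequence.
    tokens = ["[CLS]"] + words + ["[SEP]"]
    # 3. Map tokens to ids, falling back to [UNK] for unknown words.
    ids = [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]
    # 4. Pad to a fixed length; the mask is 1 for real tokens, 0 for padding.
    pad = max_len - len(ids)
    return {
        "input_ids": ids + [vocab["[PAD]"]] * pad,
        "input_mask": [1] * len(ids) + [0] * pad,
        # A single-sentence input uses segment 0 everywhere.
        "segment_ids": [0] * max_len,
    }


if __name__ == "__main__":
    print(bert_style_preprocess("Win a FREE prize now", TOY_VOCAB))
```

The three lists have the same fixed length, which is what lets messages of different sizes be batched together before being passed to an encoder.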

Fig. 5. Conversion of Text Data to Contextual Embedding in BERT Model

Fig. 6. Flowchart of Rule-Based Approach to SMS Spam

The system evaluates each incoming SMS message to determine whether it matches any of the predefined rules. If a message matches one or more of the predefined rules, it is flagged as spam; otherwise, it is considered a ham (non-spam) message. While the rule-based filtering approach is effective at identifying and classifying obvious spam messages with known characteristics, it may not capture all types of spam, especially those employing sophisticated or evolving tactics. To enhance accuracy, the remaining messages were classified using machine learning-based techniques.
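A minimal sketch of this rule-first flow follows. The study's actual rules are not reproduced in this excerpt, so the three regular expressions below are hypothetical placeholders, and `always_ham` is a stand-in for whatever trained model would handle the messages the rules do not catch.

```python
import re
from typing import Callable, List, Pattern

# Hypothetical rules for illustration only; not the rules used in the study.
HYPOTHETICAL_RULES: List[Pattern[str]] = [
    re.compile(r"\b(free|winner|prize)\b", re.IGNORECASE),  # promotional keywords
    re.compile(r"https?://", re.IGNORECASE),                # embedded links
    re.compile(r"\b\d{5,}\b"),                              # long digit strings
]


def classify_sms(message: str, fallback: Callable[[str], str]) -> str:
    """Flag a message as spam if any rule matches; otherwise defer to `fallback`."""
    if any(rule.search(message) for rule in HYPOTHETICAL_RULES):
        return "spam"
    # No rule fired: hand the message to the machine-learning stage.
    return fallback(message)


def always_ham(message: str) -> str:
    """Placeholder for a trained classifier; labels everything as ham."""
    return "ham"


if __name__ == "__main__":
    for sms in ["You are a WINNER, claim now", "See you at lunch"]:
        print(sms, "->", classify_sms(sms, always_ham))
```

Passing the second stage in as a callable keeps the rule layer independent of the model, so the classifier can be swapped without touching the rules.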

Table 3. Comparison of existing models with proposed model

Table 2. Confusion matrix of utilized Models


Empirical Analysis of Machine Learning Algorithms for Multiclass Prediction

by Danial Shabbir

2023, Wireless Communications and Mobile Computing

With the emergence of big data and the interest in deriving valuable insights from ever-growing and ever-changing streams of data, machine learning has appeared as an effective data analytic technique as compared to traditional... more

