Machine learning for email spam filtering: review, approaches and open research problems

The upsurge in the volume of unwanted emails, called spam, has created an intense need for more dependable and robust antispam filters. Machine learning methods have recently been used to detect and filter spam emails successfully. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review surveys the important concepts, attempts, efficiency, and research trends in spam filtering. The preliminary discussion in the study background examines the application of machine learning techniques to the email spam filtering process of leading internet service providers (ISPs) such as Gmail, Yahoo and Outlook. We also discuss the general email spam filtering process and the efforts of different researchers to combat spam using machine learning techniques. Our review compares the strengths and drawbacks of existing machine learning approaches and identifies the open research problems in spam filtering. We recommend deep learning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails.

Improving Spam Email Filtering Systems Using Data Mining Techniques

Implementing Computational Intelligence Techniques for Security Systems Design - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-7998-2418-3.ch003 ◽ 2020 ◽ pp. 43-72

Author(s): Wasan Shaker Awad ◽ Wafa M. Rafiq

Keyword(s): Machine Learning ◽ Data Mining ◽ Genetic Algorithm ◽ Low Cost ◽ High Accuracy ◽ False Positives ◽ Spam Filtering ◽ Spam Filter ◽ Using Data ◽ Email Spam

Email is the most popular choice of communication due to its low cost and easy accessibility, which makes email spam a major issue. A spam filter can mark emails incorrectly: legitimate emails can get lost in the spam folder, or spam emails can deluge users' inboxes. Therefore, various methods based on statistics and machine learning have been developed to classify emails accurately. In this chapter, the existing spam filtering methods were studied comprehensively, and a spam email classifier based on the genetic algorithm was proposed. The proposed algorithm achieved high accuracy by reducing the rate of false positives while maintaining an acceptable rate of false negatives. It was tested on 2000 emails from two popular spam datasets, Enron and LingSpam, and the accuracy was found to be nearly 90%. The results showed that the genetic algorithm is an effective method for spam classification and that further enhancements could provide a more robust spam filter.
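The abstract reports accuracy together with false positive and false negative rates. As a minimal sketch of how these three figures relate, assuming labels are encoded as 1 for spam and 0 for ham (the names `FilterMetrics` and `evaluate` and the sample labels are illustrative and do not come from the chapter):

```python
from dataclasses import dataclass


@dataclass
class FilterMetrics:
    accuracy: float
    false_positive_rate: float  # share of ham wrongly flagged as spam
    false_negative_rate: float  # share of spam that reached the inbox


def evaluate(y_true, y_pred):
    """Compare true and predicted labels (1 = spam, 0 = ham)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = len(pairs)
    return FilterMetrics(
        accuracy=(tp + tn) / total if total else 0.0,
        false_positive_rate=fp / (fp + tn) if (fp + tn) else 0.0,
        false_negative_rate=fn / (fn + tp) if (fn + tp) else 0.0,
    )


# Hypothetical labels for eight emails: three spam, five ham.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

A filter tuned to lower the false positive rate will usually let more spam through, which is why the two rates are worth reporting separately rather than relying on accuracy alone.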

Machine learning for media compression: challenges and opportunities

APSIPA Transactions on Signal and Information Processing ◽ 10.1017/atsip.2018.12 ◽ 2018 ◽ Vol 7 ◽ Cited By ~ 1

Author(s): Amir Said

Keyword(s): Machine Learning ◽ Video Coding ◽ Open Research ◽ Challenges And Opportunities ◽ Research Problems

Machine learning (ML) has been producing major advances in several technological fields and can have a significant impact on media coding. However, fast progress can only happen if the ML techniques are adapted to match the true needs of compression. In this paper, we analyze why some straightforward applications of ML tools to compression do not really address its fundamental problems, which explains why they have been yielding disappointing results. From an analysis of why compression can be quite different from other ML applications, we present some new problems that are technically challenging, but that can produce more significant advances. Throughout the paper, we present examples of successful applications to video coding, discuss practical difficulties that are specific to media compression, and describe related open research problems.

PERFORMANCE OF MACHINE LEARNING TECHNIQUES FOR EMAIL SPAM FILTERING

International Journal of Recent Trends in Engineering and Research ◽ 10.23883/ijrter.conf.20171201.049.yzvdv ◽ 2018 ◽ pp. 245-248

Keyword(s): Machine Learning ◽ Machine Learning Techniques ◽ Spam Filtering ◽ Learning Techniques ◽ Email Spam

A Machine Learning Based Email Spam Classification Framework Model: Related Challenges and Issues

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.d1561.029420 ◽ 2020 ◽ Vol 9 (4) ◽ pp. 3137-3144

Keyword(s): Machine Learning ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Spam Filtering ◽ Important Work ◽ Classification Framework ◽ Framework Model ◽ Spam Filters ◽ Distinguishing Features ◽ Email Spam

Spam emails, also known as non-self, are unsolicited commercial or fraudulent emails sent to a particular individual or company, or to a group of individuals. Machine learning algorithms are commonly used in spam filtering. There has been a lot of effort to make spam filtering more efficient at classifying emails as either ham (valid messages) or spam (unwanted messages) through ML classifiers, which can recognize the distinguishing features of a document's content. Much important work has been carried out in spam filtering, but it is limited to certain domains and cannot be adapted to varied conditions and problems. Our analysis contrasts the strengths and shortcomings of current ML methods and outlines open research challenges for spam filters. We suggest some of the new ongoing approaches towards deep learning as potential tactics that can tackle the challenge of spam emails efficiently.
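To make the ham/spam classification task concrete, the following is a minimal sketch of one common ML classifier, a multinomial naive Bayes model with add-one smoothing. It is not the framework proposed in the paper; the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, and the four training messages are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, messages, labels):
        for text, label in zip(messages, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(word | label); assumes each label
        # appeared at least once in training.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for word in tokens:
            if word in self.vocab:  # words never seen in training are skipped
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(self.LABELS, key=lambda label: self._log_score(tokens, label))


# Hypothetical toy training set, far too small for real use.
train_messages = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]

spam_filter = NaiveBayesSpamFilter().fit(train_messages, train_labels)
print(spam_filter.predict("claim your free prize"))
print(spam_filter.predict("project meeting on monday"))
```

The word counts play the role of the "distinguishing features" mentioned in the abstract. A model trained this way reflects only the vocabulary of its training set, which is one simple illustration of why filters tuned to one domain may not transfer to another.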