Machine learning for email spam filtering: review, approaches and open research problems

Heliyon, 2019, Vol 5 (6), pp. e01802
Author(s):
Emmanuel Gbenga Dada
Joseph Stephen Bassi
Haruna Chiroma
Shafi'i Muhammad Abdulhamid
Adebayo Olusola Adetunmbi
...
Author(s):  
RajKishore Sahni

The upsurge in the volume of unwanted emails, called spam, has created an intense need for more dependable and robust antispam filters. Machine learning methods have recently been used to detect and filter spam emails successfully. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review surveys the important concepts, attempts, efficiency, and research trends in spam filtering. The preliminary discussion in the study background examines how machine learning techniques are applied in the email spam filters of leading internet service providers (ISPs) such as Gmail, Yahoo and Outlook. We then discuss the general email spam filtering process and the efforts of different researchers to combat spam using machine learning techniques. Our review compares the strengths and drawbacks of existing machine learning approaches and identifies the open research problems in spam filtering. We recommend deep learning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails.


Author(s):
Wasan Shaker Awad
Wafa M. Rafiq

Email is the most popular choice of communication due to its low cost and easy accessibility, which makes email spam a major issue. A spam filter can mark emails incorrectly, so legitimate emails can get lost in the spam folder or spam emails can deluge users' inboxes. Therefore, various methods based on statistics and machine learning have been developed to classify emails accurately. In this chapter, the existing spam filtering methods are studied comprehensively, and a spam email classifier based on the genetic algorithm is proposed. The proposed algorithm achieved high accuracy by reducing the rate of false positives while maintaining an acceptable rate of false negatives. It was tested on 2000 emails from two popular spam datasets, Enron and LingSpam, and the accuracy was found to be nearly 90%. The results show that the genetic algorithm is an effective method for spam classification and that, with further enhancements, it can provide a more robust spam filter.
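
The abstract above reports accuracy together with false-positive and false-negative rates. As a reminder of how these quantities relate, here is a minimal sketch in Python. The function names and the toy labels are hypothetical and are not taken from the chapter; "spam" is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # legitimate email wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that reached the inbox
        else:
            tn += 1  # legitimate email correctly delivered
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Return (accuracy, false-positive rate, false-negative rate)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of ham marked as spam
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # share of spam marked as ham
    return accuracy, fpr, fnr


# Toy labels, invented for illustration only.
y_true = ["spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "spam", "ham"]
accuracy, fpr, fnr = rates(*confusion_counts(y_true, y_pred))
```

A filter tuned to lower the false-positive rate usually lets more spam through, which is why the two rates are reported separately from accuracy.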


Author(s):  
Amir Said

Machine learning (ML) has been producing major advances in several technological fields and can have a significant impact on media coding. However, fast progress can only happen if the ML techniques are adapted to match the true needs of compression. In this paper, we analyze why some straightforward applications of ML tools to compression do not really address its fundamental problems, which explains why they have been yielding disappointing results. From an analysis of why compression can be quite different from other ML applications, we present some new problems that are technically challenging, but that can produce more significant advances. Throughout the paper, we present examples of successful applications to video coding, discuss practical difficulties that are specific to media compression, and describe related open research problems.


Spam emails, also known as non-self, are unsolicited commercial or fraudulent emails sent to a particular individual or company, or to a group of individuals. Machine learning algorithms are commonly used in the area of spam filtering. There has been a lot of effort to make ML classifiers more efficient at classifying e-mails as either ham (valid messages) or spam (unwanted messages). The distinguishing features of a document's content can be recognized. Much important work in spam filtering is limited to certain domains and cannot be adapted to other conditions and problems. Our analysis contrasts the strengths and some shortcomings of current ML methods and the open research challenges in spam filtering. We suggest some of the new ongoing approaches based on deep learning as potential tactics that can tackle the challenge of spam emails efficiently.
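
To make the ham/spam classification task concrete, the following is a minimal sketch of a word-count classifier in Python. It uses multinomial naive Bayes with add-one smoothing, chosen here only as a common baseline; the abstract does not say which classifiers the authors evaluated. The training messages and all function names are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real filter would do more."""
    return text.lower().split()


def train(examples):
    """examples: list of (text, label) pairs with label 'spam' or 'ham'.

    Assumes at least one example of each class.
    """
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the higher log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        score = math.log(doc_counts[label] / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy training set, not drawn from any real corpus.
model = train([
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
])
```

With this toy model, `classify(model, "win a cash prize")` scores higher for spam and `classify(model, "monday team meeting")` scores higher for ham. The words each class uses most often are the "distinguishing features" in this simple setting.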


Author(s):
Mangena Venu Madhavan
Sagar Pande
Pooja Umekar
Tushar Mahore
Dhiraj Kalyankar
