URLCam: Toolkit for malicious URL analysis and modeling

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189874 ◽

2021 ◽

pp. 1-15

Author(s):

Mohammed Ayub ◽

El-Sayed M. El-Alfy

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Feature Selection ◽

State Of The Art ◽

Learning Algorithms ◽

Extraction Methods ◽

Machine Learning Algorithms ◽

The Other ◽

Imbalanced Learning ◽

Almost All

Web technology has become an indispensable part of human life, supporting almost all activities. At the same time, cyberattacks are on the rise in today’s Web-driven world. Effective countermeasures for the analysis and detection of malicious websites are therefore crucial to combating the rising threats to cybersecurity. In this paper, we systematically reviewed the state-of-the-art techniques and identified about 230 features of malicious websites, which are classified as internal and external features. We also developed a toolkit for the analysis and modeling of malicious websites. The toolkit implements several types of feature extraction methods and machine learning algorithms, which can be used to analyze and compare different approaches to detecting malicious URLs. In addition, the toolkit incorporates several other options, such as feature selection and imbalanced learning, and can be extended with more functionality and generalization capabilities. Finally, some use cases are demonstrated on different datasets.
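
To make the idea of URL feature extraction concrete, here is a minimal Python sketch that computes a handful of lexical features from a URL string alone. The feature names and the function `lexical_url_features` are hypothetical illustrations and are not taken from URLCam; the abstract does not list the roughly 230 features it identifies.

```python
from urllib.parse import urlparse
import ipaddress


def lexical_url_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    The feature set is an assumption chosen for illustration only.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a commonly used signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Number of non-empty path segments, e.g. /a/b.php has depth 2.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for name, value in lexical_url_features(
        "http://192.168.0.1/login/update.php?id=7"
    ).items():
        print(f"{name}: {value}")
```

A dictionary like this could be turned into one row of a feature matrix and passed to any classifier; features that need network lookups would require separate code.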

Performance Analysis of Machine Learning Algorithms and Feature Extraction Methods for Sentiment Analysis

10.1109/icses52305.2021.9633882 ◽

2021 ◽

Author(s):

Anshumaan Chauhan ◽

Ayushi Agarwal ◽

Razia Sulthana

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Performance Analysis ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Extraction Methods ◽

Machine Learning Algorithms

Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights

Biology ◽

10.3390/biology9110365 ◽

2020 ◽

Vol 9 (11) ◽

pp. 365

Author(s):

Taha ValizadehAslani ◽

Zhengqiao Zhao ◽

Bahrad A. Sokhansanj ◽

Gail L. Rosen

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Amino Acid ◽

Computational Complexity ◽

Antimicrobial Resistance ◽

Learning Algorithms ◽

Extraction Methods ◽

Machine Learning Algorithms ◽

Model Interpretation ◽

New Feature

Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning algorithm can be exploited for validating the model and obtaining new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, there are trade-offs between interpretability, computational complexity, and accuracy for different feature extraction methods. In this study, we propose a new feature extraction method, counting amino acid k-mers or oligopeptides, which provides easier model interpretation than counting nucleotide k-mers and achieves the same or even better accuracy than other methods. Additionally, we trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability, and computational complexity. We built a new feature selection pipeline for extracting important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that use only a small number of features and can predict resistance accurately.
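
As a rough illustration of k-mer counting, the following Python sketch counts overlapping k-mers in a sequence and maps the counts onto a fixed vocabulary. It assumes the input is already an amino acid sequence (translation from DNA is not shown), and the function names `count_kmers` and `kmer_vector` are hypothetical, not the authors' code.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count all overlapping substrings of length k in a sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_vector(counts: Counter, vocabulary: list) -> list:
    """Arrange k-mer counts in the order given by a fixed vocabulary."""
    return [counts.get(kmer, 0) for kmer in vocabulary]


if __name__ == "__main__":
    # Assumed toy protein fragment, one letter per amino acid.
    protein = "MKVLAMKV"
    counts = count_kmers(protein, k=3)
    print(counts)
    print(kmer_vector(counts, ["MKV", "KVL", "AAA"]))
```

Because each amino acid k-mer is a short peptide, a feature with a large model weight can be read directly as an oligopeptide, which is the interpretability benefit the abstract describes.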

Machine Learning for Analyzing Malware

Journal of Cyber Security and Mobility ◽

10.13052/2245-1439.631 ◽

2017 ◽

Author(s):

Zhenyan Liu ◽

Yifei Zeng ◽

Yida Yan ◽

Pengfei Zhang ◽

Yong Wang

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Feature Selection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

The Internet ◽

Malware Analysis ◽

Analysis Process ◽

Original Feature ◽

Work And Life

The Internet has become an indispensable part of people’s work and life, but it also provides favorable conditions for the spread of malware. As a result, malware keeps emerging, spreads faster, and has become one of the main threats to network security. Following the malware analysis process, from original feature extraction and feature selection to malware analysis, this paper introduces machine learning algorithms such as classification, clustering, and association analysis, and shows how to use them to effectively analyze malware and its variants.

A Novel Unsupervised Machine Learning-Based Method for Chatter Detection in the Milling of Thin-Walled Parts

Sensors ◽

10.3390/s21175779 ◽

2021 ◽

Vol 21 (17) ◽

pp. 5779

Author(s):

Runqiong Wang ◽

Qinghua Song ◽

Zhanqiang Liu ◽

Haifeng Ma ◽

Munish Kumar Gupta ◽

...

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Learning Algorithms ◽

Bending Moment ◽

Extraction Methods ◽

Machine Learning Algorithms ◽

Chatter Detection ◽

Unsupervised Machine Learning ◽

Milling Chatter ◽

Better Than

Data-driven chatter detection techniques avoid complex physical modeling and provide the basis for industrial applications of cutting process monitoring. Among them, feature extraction is the key step of chatter detection, which can compensate for the accuracy disadvantage of machine learning algorithms to some extent if the extracted features are highly correlated with the milling condition. However, the classification accuracy of current feature extraction methods is not satisfactory, and a combination of multiple features is required to identify chatter. This limits the development of unsupervised machine learning algorithms for chatter detection, which in turn affects their application in practical processing. In this paper, the fractal feature of the signal is extracted by the structure function method (SFM) for the first time, which solves the problem that the features are easily affected by process parameters. Milling chatter is identified based on the k-means algorithm, which avoids the complex process of training a model, and the judgment method for milling chatter is also discussed. The proposed method achieves 94.4% identification accuracy using a single signal feature, which is better than other feature extraction methods and even better than some supervised machine learning algorithms. Moreover, experiments show that chatter affects the distribution of the cutting bending moment, and that it is not reliable to monitor tool wear through the polar plot of the bending moment. This provides a theoretical basis for the application of unsupervised machine learning algorithms in chatter detection.
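
The following Python sketch shows one way a fractal feature might be estimated with a structure function. It assumes the commonly used scaling relation S(tau) proportional to tau^(4 - 2D) for a one-dimensional signal, so that D = 2 - slope / 2 on a log-log plot; the abstract does not give the paper's exact formulation, and the function names and lag choices are hypothetical. The k-means clustering step is not shown.

```python
import math


def structure_function(signal, lag: int) -> float:
    """Mean squared increment of the signal at a given lag."""
    if lag < 1 or lag >= len(signal):
        raise ValueError("lag must satisfy 1 <= lag < len(signal)")
    diffs = [(signal[i + lag] - signal[i]) ** 2
             for i in range(len(signal) - lag)]
    return sum(diffs) / len(diffs)


def fractal_dimension(signal, lags) -> float:
    """Estimate D from the log-log slope of the structure function.

    Assumes S(tau) ~ tau**(4 - 2*D), hence D = 2 - slope / 2.
    """
    xs, ys = [], []
    for lag in lags:
        value = structure_function(signal, lag)
        if value <= 0:
            raise ValueError("structure function must be positive")
        xs.append(math.log(lag))
        ys.append(math.log(value))

    # Ordinary least-squares slope of ys against xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return 2.0 - slope / 2.0


if __name__ == "__main__":
    # A straight line is a smooth curve, so the estimate should be close to 1.
    ramp = [0.5 * i for i in range(100)]
    print(fractal_dimension(ramp, lags=[1, 2, 4, 8]))
```

In a pipeline like the one described, one such value per signal segment could be passed to a two-cluster k-means step to separate stable from chatter conditions.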

Sentiment Analysis of Movie Reviews: A Study of Machine Learning Algorithms with Various Feature Selection Methods

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v5i9.113121 ◽

2017 ◽

Vol 5 (9) ◽

Cited By ~ 1

Author(s):

Rajwinder Kaur

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Selection Methods

Feature Selection with Fast Correlation-Based Filter for Breast Cancer Prediction and Classification Using Machine Learning Algorithms

2018 International Symposium on Advanced Electrical and Communication Technologies (ISAECT) ◽

10.1109/isaect.2018.8618688 ◽

2018 ◽

Author(s):

Youness Khourdifi ◽

Mohamed Bahaj

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Cancer Prediction

Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance

Journal of Water Process Engineering ◽

10.1016/j.jwpe.2021.102033 ◽

2021 ◽

Vol 41 ◽

pp. 102033

Author(s):

Faramarz Bagherzadeh ◽

Mohamad-Javad Mehrani ◽

Milad Basirifard ◽

Javad Roostaei

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Wastewater Treatment ◽

Comparative Study ◽

Total Nitrogen ◽

Wastewater Treatment Plant ◽

Learning Algorithms ◽

Treatment Plant ◽

Machine Learning Algorithms ◽

Selection Methods

Comparison of Machine Learning Algorithms to Recognize Human Activities from Images and Videos Using Pose Estimation and Feature Extraction

Proceedings of the Future Technologies Conference (FTC) 2020, Volume 1 - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-63128-4_7 ◽

2020 ◽

pp. 78-87

Author(s):

Md Hasibul Huq ◽

Mohammed Alnakli ◽

Zakiya Jafrin ◽

Tanjima Nasreen Jenia

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Pose Estimation ◽

Human Activities ◽

Learning Algorithms ◽

Machine Learning Algorithms

A REVIEW OF FEATURE EXTRACTION METHODS ON MACHINE LEARNING

Journal of Information System and Technology Management ◽

10.35631/jistm.622005 ◽

2021 ◽

Vol 6 (22) ◽

pp. 51-59

Author(s):

Mustazzihim Suhaidi ◽

Rabiah Abdul Kadir ◽

Sabrina Tiun

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Feature Selection ◽

Input Data ◽

Feature Vector ◽

Learning Algorithm ◽

Extraction Methods ◽

Machine Learning Algorithm ◽

Learning Tasks ◽

Low Dimensional

Extracting features from input data is vital for successful classification and machine learning tasks. Classification is the process of assigning an object to one of a set of predefined categories. Many different feature selection and feature extraction methods exist and are widely used. Feature extraction is a transformation of large input data into a low-dimensional feature vector, which serves as the input to a classification or machine learning algorithm. The task of feature extraction poses major challenges, which are discussed in this paper. The main challenge is to learn and extract knowledge from text datasets in order to make correct decisions. The objective of this paper is to give an overview of methods used in feature extraction for various applications, using a dataset containing a collection of texts taken from social media.
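
As a small illustration of turning text into a low-dimensional feature vector, the sketch below builds a bag-of-words vocabulary capped at a fixed size and counts term occurrences. Bag-of-words is chosen here only as a familiar example; the abstract does not say which methods the review covers, and the function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents, max_features: int) -> dict:
    """Map the most frequent tokens to vector positions."""
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    kept = [tok for tok, _ in counts.most_common(max_features)]
    # Sort so that positions do not depend on frequency order.
    return {tok: idx for idx, tok in enumerate(sorted(kept))}


def vectorize(text: str, vocabulary: dict) -> list:
    """Count occurrences of each vocabulary token in the text."""
    vector = [0] * len(vocabulary)
    for tok in tokenize(text):
        idx = vocabulary.get(tok)
        if idx is not None:  # Tokens outside the vocabulary are ignored.
            vector[idx] += 1
    return vector


if __name__ == "__main__":
    posts = ["Great phone, great camera", "Bad phone"]
    vocab = build_vocabulary(posts, max_features=2)
    print(vocab)
    print(vectorize("Great great phone tablet", vocab))
```

Capping the vocabulary with `max_features` is what keeps the vector dimension small relative to the raw text.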

A Review on Linear Regression Comprehensive in Machine Learning

Journal of Applied Science and Technology Trends ◽

10.38094/jastt1457 ◽

2020 ◽

Vol 1 (4) ◽

pp. 140-147

Author(s):

Dastan Maulud ◽

Adnan M. Abdulazeez

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Linear Relationship ◽

Multiple Regression ◽

Polynomial Regression ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Explanatory Variables ◽

Almost All ◽

Simple Regression

Linear regression is perhaps one of the most common and comprehensive statistical and machine learning algorithms. It is used to find a linear relationship between a response variable and one or more predictors. Linear regression has two types: simple regression and multiple regression (MLR). This paper discusses various works by different researchers on linear regression and polynomial regression and compares their performance using the best approach to optimize prediction and precision. Almost all of the articles analyzed in this review are focused on datasets; in order to determine a model's efficiency, it must be correlated with the actual values obtained for the explanatory variables.
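
For reference, simple regression with one predictor can be fitted in closed form by ordinary least squares. The Python sketch below is a generic textbook illustration, not code from the reviewed works; the function names and the toy data are assumptions.

```python
def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Toy data lying exactly on y = 2x + 1.
    slope, intercept = fit_simple_regression([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept)
    print(predict(4, slope, intercept))
```

Multiple regression generalizes this to several predictors and is usually solved with matrix methods rather than the scalar formulas used here.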
