Application of Machine Learning for Prediction of Lung Cancer using Omics Data

Cancer is among the deadliest diseases in many countries, yet it can be cured if detected at an early stage. Researchers in healthcare are working on early detection and prevention of cancer, and medical data now offers them huge data sets collected from all over the globe. Machine learning is now widely used in cancer diagnosis and prognosis. Survival analysis may help predict early onset of disease, relapse, and recurrence, and may support biomarker identification. In the medical field, machine learning and data mining methods are currently applied most widely to cancer detection and survival analysis. In this survey, different ways to detect and predict lung cancer using the latest machine learning algorithms combined with data mining are analyzed. A comparative study of various machine learning techniques and technologies is carried out over different types of data, such as clinical data, omics data, and image data.

Predicting Student Failure in University Examination using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.e2643.039520 ◽ 2020 ◽ Vol 9 (5) ◽ pp. 956-959

Keyword(s): Machine Learning ◽ Data Mining ◽ Performance Management ◽ Student Performance ◽ Learning Algorithms ◽ Educational Data Mining ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Social Characteristics ◽ Student Failure

Student performance management is one of the key pillars of higher education institutions, since it directly impacts students' career prospects and college rankings. This paper follows the path of learning analytics and educational data mining by applying machine learning techniques to student data to identify students who are more likely to fail university examinations, so that needed interventions can be provided to improve student performance. The paper uses a data mining approach with 10-fold cross-validation to classify students based on predictors that are demographic and social characteristics of the students. It compares five popular machine learning algorithms (REPTree, JRip, Random Forest, Random Tree, and Naive Bayes) on overall classifier accuracy as well as class-specific indicators, i.e. precision, recall, and F-measure. The results showed that the REPTree algorithm outperformed the other machine learning algorithms in classifying students who are more likely to fail the examinations.
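
The evaluation protocol described in this abstract (k-fold cross-validation scored with class-specific precision, recall, and F-measure) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names, the unstratified shuffle, and the majority-class placeholder learner are all hypothetical, and the abstract does not say which toolkit was used.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def class_metrics(y_true, y_pred, positive):
    """Precision, recall and F-measure for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Collect out-of-fold predictions: every sample is predicted by a
    model that did not see it during training."""
    predictions = [None] * len(labels)
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predict = train_fn([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        for i in fold:
            predictions[i] = predict(features[i])
    return predictions


def majority_baseline(train_x, train_y):
    """Placeholder learner (assumption): always predicts the most common
    training label. Any function with this signature can be swapped in."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda row: most_common
```

Scoring the "fail" class separately matters here because a learner such as the majority baseline can reach high overall accuracy on imbalanced data while never identifying a single at-risk student.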

Dr. Phish: Phishing Website Detector

E3S Web of Conferences ◽ 10.1051/e3sconf/202129701032 ◽ 2021 ◽ Vol 297 ◽ pp. 01032

Author(s): Harish Kumar ◽ Anshal Prasad ◽ Ninad Rane ◽ Nilay Tamane ◽ Anjali Yeole

Keyword(s): Machine Learning ◽ Data Mining ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Cyber Crime ◽ Data Mining Algorithms ◽ Learning Techniques ◽ Mining Algorithms ◽ Host Properties ◽ New Strategies

Phishing is a common attack that tricks credulous people into disclosing their personal information. It is a type of cyber-crime in which fake sites lure victims into giving up sensitive data. This paper deals with methods for detecting phishing websites by analyzing various features of URLs with machine learning techniques. It discusses detection methods based on lexical features, host properties, and page importance properties. We consider various data mining algorithms for evaluating the features in order to better understand the structure of URLs that spread phishing. To protect end users from visiting these sites, we can try to identify phishing URLs by analyzing their lexical and host-based features. A particular challenge in this domain is that criminals constantly devise new strategies to counter our defense measures. To succeed in this contest, we need machine learning algorithms that continually adapt to new examples and features of phishing URLs.
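
As a rough illustration of what "lexical features" of a URL can look like, the sketch below derives a few character-level signals with Python's standard library. The feature set and the function name `lexical_features` are assumptions chosen for illustration; the abstract does not list the features the authors actually used, and host-based or page-importance features (which need network lookups) are left out.

```python
import ipaddress
from urllib.parse import urlparse


def lexical_features(url):
    """Return simple character-level features of a URL (illustrative set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        # A raw IP address in place of a domain name is a common warning sign.
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Text before "@" is treated as user info, so it can disguise the host.
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }
```

A dictionary like this would typically be turned into a numeric vector and passed to whichever classifier is being evaluated; none of these features is decisive on its own.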

SeisBench: A toolbox for benchmarking and applying machine learning in seismology.

10.5194/egusphere-egu21-12218 ◽ 2021

Author(s): Jack Woollam ◽ Jannes Münchmeyer ◽ Carlo Giunchi ◽ Dario Jozinovic ◽ Tobias Diehl ◽ ...

Keyword(s): Machine Learning ◽ Model Comparison ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Quality Data ◽ Data Sets ◽ Waveform Data ◽ Detection Techniques ◽ Benchmark Data

Machine learning methods have seen widespread adoption within the seismological community in recent years due to their ability to effectively process large amounts of data, while equalling or surpassing the performance of human analysts or classic algorithms. In the wider machine learning world, for example in imaging applications, the open availability of extensive high-quality datasets for training, validation, and the benchmarking of competing algorithms is seen as a vital ingredient to the rapid progress observed throughout the last decade. Within seismology, vast catalogues of labelled data are readily available, but collecting the waveform data for millions of records and assessing the quality of training examples is a time-consuming, tedious process. The natural variability in source processes and seismic wave propagation also presents a critical problem during training. The performance of models trained on different regions and on different distance and magnitude ranges is not easily comparable. The inability to easily compare and contrast state-of-the-art machine learning-based detection techniques on varying seismic data sets is currently a barrier to further progress within this emerging field. We present SeisBench, an extensible open-source framework for training, benchmarking, and applying machine learning algorithms. SeisBench provides access to various benchmark data sets and models from literature, along with pre-trained model weights, through a unified API. Built to be extensible and modular, SeisBench allows for the simple addition of new models and data sets, which can be easily interchanged with existing pre-trained models and benchmark data. Standardising access to data of varying quality, and to its metadata, simplifies comparison workflows, enabling the development of more robust machine learning algorithms.
We initially focus on phase detection, identification and picking, but the framework is designed to be extended for other purposes, for example direct estimation of event parameters. Users will be able to contribute their own benchmarks and (trained) models. In the future, it will thus be much easier both to compare the performance of new algorithms against published machine learning models/architectures and to check the performance of established algorithms against new data sets. We hope that the ease of validation and inter-model comparison enabled by SeisBench will serve as a catalyst for the development of the next generation of machine learning techniques within the seismological community. The SeisBench source code will be published with an open license and explicitly encourages community involvement.

Heart Disease Prediction Using Machine Learning

International Journal of Advanced Research in Science, Communication and Technology ◽ 10.48175/ijarsct-1131 ◽ 2021 ◽ pp. 267-276

Author(s): Baban. U. Rindhe ◽ Nikita Ahire ◽ Rupali Patil ◽ Shweta Gagare ◽ Manisha Darade

Keyword(s): Machine Learning ◽ Data Mining ◽ Heart Disease ◽ Heart Diseases ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Whole Body ◽ Support Vector ◽ Learning Techniques

Heart-related diseases, or cardiovascular diseases (CVDs), have been the main cause of a huge number of deaths worldwide over the last few decades and have emerged as the most life-threatening diseases, not only in India but across the whole world. There is therefore a need for a reliable, accurate, and feasible system to diagnose such diseases in time for proper treatment. Machine learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have used machine learning techniques to help the healthcare industry and its professionals diagnose heart-related diseases. After the brain, the heart is the next most important organ in the human body. It pumps blood and supplies it to all organs of the body. Predicting the occurrence of heart disease is significant work in the medical field. Data analytics is useful for making predictions from large amounts of information, and it helps medical centers predict various diseases. A huge amount of patient-related data is maintained on a monthly basis. The stored data can serve as a source for predicting the occurrence of future diseases. Data mining and machine learning techniques used to predict heart disease include Artificial Neural Networks (ANN), Random Forest, and Support Vector Machines (SVM). Predicting and diagnosing heart disease has become a challenge for doctors and hospitals both in India and abroad. To reduce the large number of deaths from heart disease, a quick and efficient detection technique needs to be found. Data mining techniques and machine learning algorithms play a very important role in this area. Researchers are accelerating their work to develop software, with the help of machine learning algorithms, that can support doctors in both predicting and diagnosing heart disease.
The main objective of this research project is to predict the heart disease of a patient using machine learning algorithms.

Heart Disease Prediction Using Machine Learning Algorithms

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit206421 ◽ 2020 ◽ pp. 137-149

Author(s): Abhay Agrahary

Keyword(s): Machine Learning ◽ Data Mining ◽ Heart Disease ◽ Accurate Diagnosis ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Disease Prediction ◽ Huge Amount ◽ Learning Techniques ◽ Comparative Results

Heart disease is one of the most fatal problems in the whole world; it cannot be seen with the naked eye and strikes suddenly once its limits are reached. It therefore needs accurate diagnosis at the right time. The healthcare industry produces a huge amount of data every day related to patients and diseases. However, this data is not used efficiently by researchers and practitioners. Today the healthcare industry is rich in data but poor in knowledge. Various data mining and machine learning techniques and tools are available to extract useful knowledge from databases and to use this knowledge for more accurate diagnosis and decision making. With increasing research on heart disease prediction systems, it has become important to summarize the still incomplete research on the topic. The main objective of this research paper is to summarize recent research on heart disease prediction, with comparative results, and to draw analytical conclusions. The study observes that Naive Bayes with a genetic algorithm, decision trees, and artificial neural networks improve the accuracy of heart disease prediction systems in different scenarios. The paper also summarizes commonly used data mining and machine learning techniques and their complexities.

Relevant Independent Variables on MOBA Video Games to Train Machine Learning Algorithms

10.24132/csrn.2021.3101.19 ◽ 2021

Author(s): Juan Guillermo López Guzmán ◽ Cesar Julio Bustacara Medina

Keyword(s): Machine Learning ◽ Video Games ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Multidimensional Data ◽ Data Sets ◽ Network Architectures ◽ Independent Variables ◽ Learning Techniques ◽ Multidimensional Data Sets

The popularity of Multiplayer Online Battle Arena (MOBA) video games has grown considerably. Their popularity, as well as the complexity of their gameplay, has in recent years attracted the attention of researchers from various areas of knowledge, in particular those who have turned to different machine learning techniques. The papers reviewed mainly look for patterns in multidimensional data sets. However, this previous research does not present a way to select the independent variables (predictors) used to train the models. For this reason, this paper proposes a list of variables based on the techniques used and the objectives of the research, providing a set of variables for finding patterns in MOBA video games. To obtain this list, the works consulted were grouped by the machine learning techniques used, ranging from rule-based systems to complex neural network architectures. A grouping is also applied based on the objective of each study.

EKMPRFG: Ensemble of KNN, Multilayer Perceptron and Random Forest using Grading for Android Malware Classification

International Journal of Recent Technology and Engineering - 2 ◽ 10.35940/ijrte.e5866.018520 ◽ 2020 ◽ Vol 8 (5) ◽ pp. 3353-3360

Keyword(s): Machine Learning ◽ Feature Selection ◽ Standard Deviation ◽ Principal Component ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Data Sets ◽ Android Malware ◽ Android Malware Detection ◽ Significant Research

Android is the most popular operating system, with over 2.5 billion devices across the globe. The popularity of this OS has unfortunately made the devices, and the services they enable, vulnerable to numerous security threats. As a result, significant research is being done in the field of Android malware detection employing machine learning algorithms. Our current work emphasizes the possible use of machine learning techniques for detecting malware on such Android devices. The proposed EKMPRFG is applied to the classification of Android malware after a preprocessing phase involving a hybrid feature selection model that uses the proposed Standard Deviation of Standard Deviation of Ranks (SDSDR) and several other built-in feature selection algorithms, such as Correlation-based Feature Selection (CFS), Classifier SubsetEval, Consistency SubsetEval, and Filtered SubsetEval, followed by Principal Component Analysis (PCA) for dimensionality reduction. The experimental results obtained on two data sets indicate that EKMPRFG outperforms existing works in terms of Prediction Accuracy and Weighted F-Measure values.
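
The abstract does not define the Weighted F-Measure it reports. A common definition, and the one assumed here, averages the per-class F-measures weighted by the number of samples in each class. The symbols below are introduced only for illustration: $P_c$ and $R_c$ are the precision and recall of class $c$, $n_c$ is the number of samples whose true class is $c$, and $C$ is the number of classes.

```latex
F_c = \frac{2 P_c R_c}{P_c + R_c},
\qquad
F_{\text{weighted}} = \sum_{c=1}^{C} \frac{n_c}{N}\, F_c,
\qquad
N = \sum_{c=1}^{C} n_c
```

Under this definition, a classifier that does well only on the larger class can still score highly, which is one reason the measure is usually reported alongside per-class figures.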

Prediction and Analysis of Student Performance in Secondary Education Based on Data Mining and Machine Learning Techniques

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit20653 ◽ 2020 ◽ pp. 294-301

Author(s): Meenal Joshi ◽ Shiv Kumar

Keyword(s): Machine Learning ◽ Data Mining ◽ Secondary School ◽ Student Performance ◽ Learning Algorithms ◽ Research Work ◽ Educational Data Mining ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Student's Performance

In the modern era, education is the key to achieving success in the future; it develops a person's personality, thoughts, and social skills. The purpose of this research work is to focus on educational data mining (EDM) through machine learning algorithms. EDM aims to discover hidden knowledge and patterns about students' performance. Machine learning can be useful for predicting the learning outcomes of students. Over the last few years, several tools have been used to judge students' performance from different points of view, such as the student's level, objectives, techniques, algorithms, and different methods. In this paper, student performance in secondary school is predicted and analyzed using data mining techniques and machine learning algorithms such as Naive Bayes, the J48 decision tree algorithm, and Logistic Regression. For this purpose, a dataset is collected from "Secondary School", and filtering is then applied to the desired values using the WEKA tool.
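
To make one of the named algorithms concrete, here is a minimal sketch of a categorical Naive Bayes classifier with Laplace smoothing. It is not the WEKA implementation the abstract refers to: the function name `train_naive_bayes`, the smoothing parameter `alpha`, and the idea of encoding student attributes as categorical tuples are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    # value_counts[(label, feature index)][value] = occurrences in training data
    value_counts = defaultdict(Counter)
    # feature_values[feature index] = distinct values seen for that feature
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            feature_values[j].add(value)

    def predict(row):
        best_label, best_score = None, -math.inf
        for label, count in class_counts.items():
            # Log prior plus the sum of per-feature log likelihoods; adding
            # them relies on the "naive" conditional independence assumption.
            score = math.log(count / len(labels))
            for j, value in enumerate(row):
                numerator = value_counts[(label, j)][value] + alpha
                denominator = count + alpha * len(feature_values[j])
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return predict
```

For example, with hypothetical rows of the form `(study_time, extra_support)` and labels `"pass"` or `"fail"`, `train_naive_bayes(rows, labels)` returns a function that maps a new row to the more probable label. Smoothing keeps an attribute value that never co-occurred with a class from forcing that class's probability to zero.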

A Classification Process for Detection Lung Cancer at Early Stage using Machine Learning Techniques

International Journal of Advanced Trends in Computer Science and Engineering ◽ 10.30534/ijatcse/2020/222922020 ◽ 2020 ◽ Vol 9 (2) ◽ pp. 2371-2376

Author(s): Kavita Srivastava

Keyword(s): Machine Learning ◽ Lung Cancer ◽ Early Stage ◽ Machine Learning Techniques ◽ Learning Techniques

Efficient Machine Learning Techniques to Diagnose and Predict Alzheimer’s disease

International Journal of Engineering and Advanced Technology - Regular Issue ◽ 10.35940/ijeat.c6508.029320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 3953-3960

Keyword(s): Machine Learning ◽ Alzheimer's Disease ◽ Early Diagnosis ◽ Image Data ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Learning Mechanisms ◽ Learning Techniques

Recent research in computational engineering has seen the design and development of numerous intelligent models to analyze medical data and derive inferences related to early diagnosis and prediction of disease severity. In this context, predicting and diagnosing fatal neurodegenerative diseases in the dementia class from medical image data is considered a challenging area of research by many researchers. Alzheimer’s disease is now considered a major category of dementia that affects a large population. Despite the development of numerous machine learning models for early diagnosis of Alzheimer’s disease, there remains considerable scope for research. To address this, this article presents a systematic literature review of machine learning techniques developed for early diagnosis of Alzheimer’s disease. Furthermore, the article covers major categories of machine learning algorithms, including artificial neural networks, support vector machines, and deep learning based ensemble models, to help budding researchers explore the scope of research in predicting Alzheimer’s disease. Implementation results present a comparative analysis of state-of-the-art machine learning mechanisms.
