Benchmarking relief-based feature selection methods for bioinformatics data mining

2018 ◽  
Vol 85 ◽  
pp. 168-188 ◽  
Author(s):  
Ryan J. Urbanowicz ◽  
Randal S. Olson ◽  
Peter Schmitt ◽  
Melissa Meeker ◽  
Jason H. Moore
Data Mining ◽  
2013 ◽  
pp. 92-106
Author(s):  
Harleen Kaur ◽  
Ritu Chauhan ◽  
M. Alam

The continuous availability of massive experimental medical data has given impetus to a large effort to develop mathematical, statistical, and computationally intelligent techniques for inferring models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input for data mining systems in the medical domain. In this chapter, the authors focus on several feature selection methods and their effectiveness in preprocessing input medical data. They evaluate several feature selection algorithms, such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC), together with the naive Bayes and linear discriminant analysis classifiers. The experimental analysis of feature selection techniques on medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing the processing time.
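As a rough illustration of the filter-style selection this abstract refers to, the sketch below ranks discrete features by their mutual information with the class label. It is a plain relevance ranking, not MIFS or FCBF themselves (both additionally penalize redundancy between features), and the feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels):
    """Return (names sorted by decreasing MI with the labels, per-feature scores)."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical toy data: three discretized features and a binary diagnosis.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "marker_a": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the label
    "marker_b": [0, 1, 0, 1, 0, 0, 1, 1],  # statistically independent of the label
    "marker_c": [0, 0, 1, 1, 0, 1, 1, 0],  # agrees with the label on 6 of 8 rows
}
ranking, scores = rank_features(columns, labels)
print(ranking)
```

A feature that copies the label scores the full label entropy, and an independent feature scores zero; a selected subset would then be passed to a classifier such as naive Bayes.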




2016 ◽  
Vol 71 ◽  
pp. 76-85 ◽  
Author(s):  
Farideh Bagherzadeh-Khiabani ◽  
Azra Ramezankhani ◽  
Fereidoun Azizi ◽  
Farzad Hadaegh ◽  
Ewout W. Steyerberg ◽  
...  

2010 ◽  
Vol 18 (03) ◽  
pp. 605-619
Author(s):  
JIANXIN CHEN ◽  
ZHENHUA JIA ◽  
XIANGCHUN WU ◽  
GUOQIANG YUAN ◽  
CONG WEI ◽  
...  

Hyperlipidemia (HL) and unstable angina (UA) are two sequential diseases that cause increasing morbidity and mortality worldwide. Biomarker selection at the level of physical and chemical specifications (PCS) plays a key role in understanding the pathology of both diseases. The Neuro-Endocrine-Immune (NEI) system is a suitable pathway for investigating the interaction network of related PCS in the context of HL and UA. Data mining approaches are advanced statistical methods that can unravel the "secret" of the interaction network of PCS in both diseases. Feature selection methods are a branch of data mining that selects an informative subset of PCS as biomarkers to distinguish a disease cohort from a healthy control cohort with high classification accuracy. In this paper, we first use three feature selection methods combined with a decision tree classification algorithm to select several biomarkers from the NEI network. The results show that the SVM-based decision tree is the best fit for selecting biomarkers for both diseases. Furthermore, we use theory from Traditional Chinese Medicine (TCM) to divide HL and UA patients into two subgroups. Based on this, we propose a novel feature selection method to distinguish the two subgroups. We combine variance analysis with a classification method to select three to four biomarkers for the two subgroups in the context of HL and UA respectively, which indicates that NEI specifications behave differently between the two subgroups. According to the basic theory of TCM, the different subgroups defined by TCM need to be treated differently. This means that patients with the same disease may be treated in a personalized way. The research in this paper not only provides a better avenue for understanding the nature of the diseases but also lays a basis for treating the two diseases in a personalized way.
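The abstract does not spell out how variance analysis is combined with classification. As one possible reading, the sketch below scores each specification with a one-way ANOVA F statistic between two subgroups and ranks the specifications by that score. The names `spec_1` to `spec_3`, the subgroup labels, and the values are hypothetical, and this is not the authors' method.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values.

    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_by_f(features, subgroup):
    """Return (names sorted by decreasing F score, per-feature F scores)."""
    scores = {}
    for name, values in features.items():
        groups = {}
        for value, label in zip(values, subgroup):
            groups.setdefault(label, []).append(value)
        scores[name] = f_statistic(list(groups.values()))
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical measurements for six patients in two subgroups.
subgroup = ["A", "A", "A", "B", "B", "B"]
features = {
    "spec_1": [1, 2, 3, 4, 5, 6],  # subgroup means differ strongly
    "spec_2": [1, 2, 3, 2, 3, 1],  # identical subgroup means
    "spec_3": [1, 2, 3, 2, 3, 4],  # small difference in means
}
f_ranking, f_scores = rank_by_f(features, subgroup)
print(f_ranking)
```

The top three or four specifications from such a ranking could then be passed to a classifier to check how well they separate the subgroups.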


Author(s):  
Raghavendra S ◽  
Santosh Kumar J

Data mining is the process of viewing data from different angles and compiling it into useful information. Recent advances in data mining and machine learning have enabled biomedical research to improve general health care. Because wrong classification may lead to poor prediction, there is a need for better classification, which in turn improves the prediction rate on medical datasets. When data mining is applied to medical datasets, classification and prediction are the most important and difficult challenges. In this work we evaluate the PIMA Indian Diabetes data set from the UCI repository using a machine learning algorithm, Random Forest, together with feature selection methods such as forward selection and backward elimination based on an entropy evaluation method, using a percentage split as the test option. The experiment was conducted on the RStudio platform and achieved a classification accuracy of 84.1%. From the results we conclude that Random Forest predicts diabetes better than other techniques with fewer attributes, so that the least important tests for identifying diabetes can be avoided.
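To make the forward selection and percentage split mentioned above concrete, here is a minimal sketch of a greedy wrapper. It makes several assumptions: a nearest-centroid classifier stands in for Random Forest so the example needs only the standard library, plain accuracy replaces the entropy-based evaluation, the data are hypothetical rather than the PIMA data set, and Python is used although the study used R. The function names are illustrative.

```python
def split(rows, labels, train_fraction=0.66):
    """Percentage split: the first part trains, the rest tests (no shuffling)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], labels[:cut], rows[cut:], labels[cut:]


def centroid_accuracy(train_x, train_y, test_x, test_y, feats):
    """Test accuracy of a nearest-centroid classifier using only columns in feats."""
    sums, counts = {}, {}
    for row, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += row[f]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(row):
        return min(centroids, key=lambda y: sum(
            (row[f] - c) ** 2 for f, c in zip(feats, centroids[y])))

    hits = sum(predict(row) == y for row, y in zip(test_x, test_y))
    return hits / len(test_y)


def forward_selection(rows, labels, n_features, train_fraction=0.66):
    """Greedily add the feature that most improves accuracy; stop when none helps.

    Note: scoring on the test split is a simplification; a real experiment
    would select features on a separate validation set.
    """
    train_x, train_y, test_x, test_y = split(rows, labels, train_fraction)
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(centroid_accuracy(train_x, train_y, test_x, test_y,
                                     selected + [f]), f) for f in remaining]
        score, feat = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Hypothetical data: column 0 separates the classes, columns 1 and 2 are noise.
rows = [
    [1.0, 5.0, 0.2], [9.0, 5.1, 0.3], [1.2, 4.9, 0.1],
    [8.8, 5.0, 0.2], [0.9, 5.2, 0.3], [9.1, 4.8, 0.1],
    [1.1, 5.1, 0.3], [9.2, 4.9, 0.1], [0.8, 5.0, 0.2],
    [8.9, 5.2, 0.2], [1.3, 4.8, 0.1], [9.0, 5.0, 0.3],
]
labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
selected, best = forward_selection(rows, labels, n_features=3, train_fraction=0.5)
print(selected, best)
```

Backward elimination follows the same pattern in reverse: start from all attributes and repeatedly drop the one whose removal does not reduce the score.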

