Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines

2014 ◽  
Vol 286 ◽  
pp. 228-246 ◽  
Author(s):  
Sebastián Maldonado ◽  
Richard Weber ◽  
Fazel Famili
10.29007/h71z ◽  
2020 ◽  
Author(s):  
Waleed Almutairi ◽  
Ryszard Janicki

The paper deals with problems that imbalanced and overlapping datasets often encounter. Performance indicators such as accuracy, precision and recall of imbalanced datasets, both with and without overlapping, are discussed and compared with the same performance indicators of balanced datasets with overlapping. Three popular classification algorithms, namely Decision Tree, KNN (k-Nearest Neighbors) and SVM (Support Vector Machines) classifiers, are analyzed and compared.
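
As a rough illustration of why these indicators are compared on imbalanced versus balanced data, the sketch below computes accuracy, precision and recall for a trivial classifier that always predicts the majority class. The label vectors and function names are hypothetical and are not taken from the paper; no real classifier or dataset is involved.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall); undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels: 95 negatives and 5 positives (imbalanced),
# versus 50 negatives and 50 positives (balanced).
y_imbalanced = [0] * 95 + [1] * 5
y_balanced = [0] * 50 + [1] * 50

# A classifier that always predicts the majority (negative) class.
always_negative = [0] * 100

imbalanced_scores = scores(y_imbalanced, always_negative)
balanced_scores = scores(y_balanced, always_negative)

print("imbalanced:", imbalanced_scores)
print("balanced:  ", balanced_scores)
```

On the imbalanced labels the accuracy looks high even though no positive example is ever found, which is why precision and recall are reported alongside it.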


2004 ◽  
Vol 13 (04) ◽  
pp. 791-800 ◽  
Author(s):  
HOLGER FRÖHLICH ◽  
OLIVIER CHAPELLE ◽  
BERNHARD SCHÖLKOPF

The problem of feature selection is a difficult combinatorial task in Machine Learning and of high practical relevance, e.g. in bioinformatics. Genetic Algorithms (GAs) offer a natural way to solve this problem. In this paper we present a specialized Genetic Algorithm that explicitly takes into account the existing bounds on the generalization error for Support Vector Machines (SVMs). This new approach is compared to the traditional method of performing cross-validation and to other existing algorithms for feature selection.
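
The general idea of searching over feature subsets with a Genetic Algorithm can be sketched as follows. This is a minimal, generic GA over bit masks and not the authors' algorithm: the paper scores candidates with bounds on the SVM generalization error, whereas the `fitness` function here is a toy stand-in that rewards a hypothetical set of "relevant" feature indices. All names and parameter values are assumptions chosen for illustration.

```python
import random

RELEVANT = {0, 3, 7}  # hypothetical indices of informative features


def fitness(mask, relevant=RELEVANT, penalty=0.5):
    """Toy score: +1 per relevant feature kept, -penalty per irrelevant one.

    Stand-in for a real criterion such as a generalization-error bound
    or a cross-validation estimate.
    """
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in relevant)
    return hits - penalty * extras


def evolve(n_features, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve bit masks (1 = feature selected); returns (initial_best, best)."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    initial_best = max(population, key=fitness)[:]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]  # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children

    return initial_best, max(population, key=fitness)


initial_best, best = evolve(n_features=10)
print("initial best:", initial_best, fitness(initial_best))
print("final best:  ", best, fitness(best))
```

Because the best individual is copied unchanged into each new generation, the best score never decreases; swapping in a different `fitness` is the only change needed to try another selection criterion.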

