Dealing With Redundant Features and Inconsistent Training Data in Electronic Nose: A Rough Set Based Approach

2014 ◽  
Vol 14 (3) ◽  
pp. 758-767 ◽  
Author(s):  
Anil Kumar Bag ◽  
Bipan Tudu ◽  
Nabarun Bhattacharyya ◽  
Rajib Bandyopadhyay
Author(s):  
Y. Ishii ◽  
K. Iwao ◽  
T. Kinoshita

<p><strong>Abstract.</strong> This paper aims to clarify the meaning of the membership value produced as a by-product of land cover classification by the Grade-added rough set (GRS). A new land cover classification method using GRS has been developed. Its classification scheme, which calculates a membership (degree of grade) for each class, is similar to those of MLC and SVM. However, two things remain unclear: what the GRS membership means, and why choosing the class with the larger membership works well. In this study, aerial images were used to visualize the relation between the membership of GRS and that of the existing classifiers, MLC and SVM. Furthermore, a model experiment in a two-dimensional feature space was conducted. These experiments showed that the degree of grade is the distance from the nearest training sample of another class. That is, the membership of GRS is similar in meaning to that of SVM, because SVM also calculates a distance from a boundary line determined by support vectors, whereas the membership of MLC is a distance from the centroid of the sample's own class. It was also found that, because the degree of grade is the distance from the closest other class, a higher grade implies higher certainty. This research thus clarifies some of the features of land cover classification using GRS.</p>
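The interpretation stated in the abstract (degree of grade as the distance from the nearest training sample of another class) can be illustrated with a minimal sketch. This is not the GRS algorithm itself; the function names, the Euclidean metric, and the toy two-dimensional samples are assumptions made only for illustration.

```python
import math

def grade_for_class(x, samples, labels, cls):
    """Distance from x to the nearest training sample NOT labelled cls.

    Assumption: Euclidean distance in feature space; the abstract does
    not state which metric GRS uses.
    """
    others = [s for s, l in zip(samples, labels) if l != cls]
    return min(math.dist(x, s) for s in others)

def classify(x, samples, labels):
    """Pick the class with the largest grade (larger grade, higher certainty)."""
    grades = {c: grade_for_class(x, samples, labels, c) for c in sorted(set(labels))}
    best = max(grades, key=grades.get)
    return best, grades

# Hypothetical two-dimensional training data with two land cover classes.
samples = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0), (5.0, 4.0)]
labels = ["water", "water", "forest", "forest"]

label, grades = classify((0.5, 0.5), samples, labels)
print(label, grades)
```

The query point lies close to the "water" samples, so its distance to the nearest non-water sample is large and "water" receives the higher grade.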


Author(s):  
TAGHI M. KHOSHGOFTAAR ◽  
LOFTON A. BULLARD ◽  
KEHAN GAO

Finding techniques to reduce software development effort and produce highly reliable software is a vital goal for software developers. One method that has proven quite useful is the application of software metrics-based classification models. Classification models can be constructed to identify faulty components in a software system with high accuracy. Significant research has been dedicated to developing methods for improving the quality of software metrics-based classification models. Several studies have shown that the accuracy of these models improves when irrelevant attributes are identified and eliminated from the training data set. This study presents a rough set theory approach, based on classical set theory, for identifying and eliminating irrelevant attributes from a training data set. Rough set theory is used to find small groups of attributes, determined by the relationships that exist between the objects in a data set, with discernibility comparable to that of larger sets of attributes. This allows for the development of simpler classification models that are easy for analysts to understand and explain to others. We built case-based reasoning models in order to evaluate their classification performance on the smaller subsets of attributes selected using rough set theory. The empirical studies demonstrated that by applying a rough set approach to find small subsets of attributes we can build case-based reasoning models with an accuracy comparable to, and in some cases better than, a case-based reasoning model built with a complete set of attributes.
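As a rough illustration of finding a small attribute subset with the same discernibility as the full set, the sketch below computes the standard rough set dependency degree and searches exhaustively for the smallest subsets that preserve it. The decision table, attribute names, and the brute-force search are assumptions for illustration, not the procedure used in the study.

```python
from collections import defaultdict
from itertools import combinations

def dependency(rows, attrs, decision):
    """Fraction of rows whose values on attrs determine the decision uniquely."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)

def minimal_reducts(rows, attrs, decision):
    """Smallest attribute subsets with the same dependency as the full set."""
    full = dependency(rows, attrs, decision)
    for size in range(1, len(attrs) + 1):
        found = [subset for subset in combinations(attrs, size)
                 if dependency(rows, subset, decision) == full]
        if found:
            return found
    return [tuple(attrs)]

# Hypothetical discretized software metrics for four modules.
rows = [
    {"loc": "high", "complexity": "high", "churn": "low",  "faulty": "yes"},
    {"loc": "high", "complexity": "low",  "churn": "low",  "faulty": "no"},
    {"loc": "low",  "complexity": "high", "churn": "high", "faulty": "yes"},
    {"loc": "low",  "complexity": "low",  "churn": "high", "faulty": "no"},
]
attrs = ["loc", "complexity", "churn"]

reducts = minimal_reducts(rows, attrs, "faulty")
print(reducts)
```

In this toy table "complexity" alone separates faulty from non-faulty modules, so the other two attributes are irrelevant. The exhaustive search is exponential in the number of attributes and is only practical for small tables.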


2011 ◽  
Vol 58-60 ◽  
pp. 974-977 ◽  
Author(s):  
Jun Rong Yan ◽  
Yong Min ◽  
Xia Cui ◽  
Yan Huang

Artificial neural networks are one of the most important methods in intelligent fault diagnosis because they offer nonlinear pattern classification together with self-learning and self-organization, but they cannot judge whether information is redundant or useful. Rough sets can reduce the knowledge of an information system and remove redundant information. In this paper, fault data from rolling bearings were reduced by the greedy algorithm of rough sets, and both the training data and the test data of a BP neural network were reduced in this way. A comparison of the test results on the reduced data and the original data indicated that the resolving power was unchanged while the database became simpler.
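The abstract does not say which greedy algorithm was used. One common greedy heuristic for rough set reduction (often attributed to Johnson) repeatedly picks the attribute that discerns the most remaining pairs of objects with different decisions. The sketch below assumes that heuristic; the bearing features and fault labels are hypothetical.

```python
from itertools import combinations

def greedy_reduct(rows, attrs, decision):
    """Greedy attribute selection based on discerned object pairs."""
    # Pairs of rows with different decisions must be told apart.
    pairs = [(a, b) for a, b in combinations(rows, 2)
             if a[decision] != b[decision]]
    # Ignore pairs that no attribute can discern (inconsistent data).
    pairs = [(a, b) for a, b in pairs if any(a[x] != b[x] for x in attrs)]
    chosen = []
    while pairs:
        # Attribute that separates the most remaining pairs (first wins ties).
        best = max(attrs, key=lambda x: sum(a[x] != b[x] for a, b in pairs))
        chosen.append(best)
        # Keep only the pairs the chosen attribute fails to separate.
        pairs = [(a, b) for a, b in pairs if a[best] == b[best]]
    return chosen

# Hypothetical discretized vibration features of a rolling bearing.
rows = [
    {"rms": "low",  "kurtosis": "low",  "peak": "low", "fault": "normal"},
    {"rms": "high", "kurtosis": "low",  "peak": "low", "fault": "outer"},
    {"rms": "high", "kurtosis": "high", "peak": "low", "fault": "inner"},
    {"rms": "low",  "kurtosis": "high", "peak": "low", "fault": "ball"},
]

reduct = greedy_reduct(rows, ["rms", "kurtosis", "peak"], "fault")
print(reduct)
```

The constant "peak" attribute never separates any pair, so it is dropped, and the remaining two attributes still distinguish all four fault classes. A greedy result is not guaranteed to be a minimal reduct in general.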


Fuzzy Systems ◽  
2017 ◽  
pp. 1367-1384
Author(s):  
Noor Akhmad Setiawan

The objective of this research is to develop an evidence-based fuzzy decision support system for the diagnosis of coronary artery disease. The system is developed in three processing stages: rule generation, rule selection and rule fuzzification. Rough Set Theory (RST) is used to generate the classification rules from the training data set. The training data are obtained from the University of California, Irvine (UCI) data repository. Rule selection is conducted by transforming the rules into a decision table based on an unseen data set. Furthermore, RST attribute reduction is proposed and applied to select the most important rules. The selected rules are transformed into fuzzy rules based on discretization cuts of numerical input attributes and simple triangular and trapezoidal membership functions. Fuzzy rule weighting is also proposed and applied based on rule support on the training data. The system is validated using UCI heart disease data sets collected from the U.S., Switzerland and Hungary and a data set from Ipoh Specialist Hospital, Malaysia. The system is verified by three cardiologists. The results show that the system is able to give the approximate possibility of coronary artery blockage.
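For readers unfamiliar with the triangular and trapezoidal membership functions mentioned above, a minimal sketch follows. The breakpoints and the example attribute values are hypothetical and are not the discretization cuts used in the study.

```python
def triangular(x, a, b, c):
    """Rises linearly from a to the peak at b, then falls to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Rises from a to b, stays at 1 between b and c, falls to d (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical fuzzy sets for a numerical input attribute.
medium = triangular(200, 180, 220, 260)       # halfway up the rising edge
high = trapezoidal(250, 220, 260, 300, 340)   # three quarters up the rising edge
plateau = trapezoidal(280, 220, 260, 300, 340)

print(medium, high, plateau)
```

Placing the breakpoints at or around the discretization cuts lets a crisp interval such as "220 to 260" become a fuzzy set with gradual boundaries.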


2003 ◽  
Vol 15 (4) ◽  
pp. 369-376 ◽  
Author(s):  
Bancha Charumporn ◽  
Michifumi Yoshioka ◽  
Toru Fujinaka ◽  
Sigeru Omatu

An electronic nose developed from metal oxide gas sensors is applied to test smoke from three common household burning materials under different environments. Training data for a layered neural network trained with error back-propagation (BP) are generally selected at random. Randomly selected training data always contain redundant data that lengthen training time without improving classification performance. This paper proposes an effective method to select training data based on a similarity index (SI). The SI ensures that only the most valuable training data are included in the training data set. The proposed method is applied to remove redundant data from the training data set before it is fed to the BP-based layered neural network. The results confirm high classification performance using a small number of training data selected by the proposed method.
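The abstract does not define the similarity index. The sketch below assumes cosine similarity between sensor response vectors and a fixed threshold, and drops a sample when it is nearly identical to an already kept sample of the same class. The function names, threshold, and sensor readings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def select_training_data(samples, labels, threshold=0.999):
    """Keep a sample unless a kept sample of the same class is too similar."""
    kept = []
    for x, y in zip(samples, labels):
        redundant = any(ky == y and cosine(x, kx) >= threshold
                        for kx, ky in kept)
        if not redundant:
            kept.append((x, y))
    return kept

# Hypothetical responses of three gas sensors to smoke samples.
samples = [(1.0, 2.0, 3.0), (1.01, 2.0, 3.0), (3.0, 1.0, 0.5), (1.0, 2.0, 3.0)]
labels = ["paper", "paper", "paper", "wood"]

kept = select_training_data(samples, labels)
print(len(kept), [y for _, y in kept])
```

The second "paper" sample is almost parallel to the first and is dropped, while the identical vector labelled "wood" is kept because redundancy is judged only within a class.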


2011 ◽  
Vol 11 (11) ◽  
pp. 3001-3008 ◽  
Author(s):  
Anil Kumar Bag ◽  
Bipan Tudu ◽  
Jayashri Roy ◽  
Nabarun Bhattacharyya ◽  
Rajib Bandyopadhyay

2011 ◽  
Author(s):  
Rajib Bandyopadhyay ◽  
Anil Kumar Bag ◽  
Bipan Tudu ◽  
Nabarun Bhattacharyya ◽  
Perena Gouma
