Author(s):  
Kehan Gao ◽  
Taghi M. Khoshgoftaar

In software defect prediction, a classification model is first built from software metrics and fault data gathered from a past software development project. The model is then applied to data from a similar project, or from a new release of the same project, to classify new program modules as either fault-prone (fp) or not-fault-prone (nfp). The benefit of such a model is that it facilitates the optimal use of limited financial and human resources for software testing and inspection. The predictive power of a classification model constructed from a given data set is affected by many factors. In this paper, we focus on two problems that often arise in software measurement data: high dimensionality and class imbalance, i.e., unequal numbers of the two types of modules (e.g., many more nfp modules than fp modules in a data set). These problems directly lengthen learning time and degrade the predictive performance of classification models. We consider using data sampling followed by feature selection (FS) to deal with them. Six data sampling strategies (three sampling techniques, each combined with two post-sampling proportion ratios) and six commonly used feature ranking approaches are employed in this study. We evaluate the FS techniques in two ways: (1) a general method, i.e., assessing the classification performance after the training data is modified, and (2) studying the stability of an FS method, specifically with the goal of understanding the effect of data sampling techniques on the stability of FS when it is applied to the sampled data. The experiments were performed on nine data sets from a real-world software project. The results demonstrate that the FS techniques that most enhance the models' classification performance do not also show the best stability, and vice versa. In addition, classification performance is affected more by the sampling techniques themselves than by the post-sampling proportions, whereas the opposite holds for stability.
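The sampling-then-ranking pipeline and the stability check described in this abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' setup: random undersampling stands in for the three unnamed sampling techniques, a simple mean-separation score stands in for the six unnamed feature rankers, the Jaccard overlap of top-k feature sets stands in for the stability measure, and the data are synthetic.

```python
import random
from statistics import mean, pstdev

def undersample(X, y, minority_share, seed=0):
    """Randomly drop nfp (label 0) rows until fp (label 1) rows make up
    roughly `minority_share` of the data. Assumes fp is the minority class."""
    rng = random.Random(seed)
    fp = [i for i, label in enumerate(y) if label == 1]
    nfp = [i for i, label in enumerate(y) if label == 0]
    keep_nfp = min(len(nfp), round(len(fp) * (1 - minority_share) / minority_share))
    kept = sorted(fp + rng.sample(nfp, keep_nfp))
    return [X[i] for i in kept], [y[i] for i in kept]

def rank_features(X, y, k):
    """Return the indices of the k best features under an illustrative
    filter score: |mean(fp) - mean(nfp)| / (std(fp) + std(nfp))."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg)
        score = abs(mean(pos) - mean(neg)) / spread if spread else 0.0
        scores.append((score, j))
    scores.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by index
    return [j for _, j in scores[:k]]

def jaccard(a, b):
    """Overlap of two feature subsets: 1.0 means identical, 0.0 disjoint."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Synthetic data (hypothetical): 10 fp and 90 nfp modules, 5 metrics,
# of which only metric 0 actually separates the classes.
rng = random.Random(42)
y = [1] * 10 + [0] * 90
X = [[(10.0 if label else 0.0) + rng.gauss(0, 1)]
     + [rng.gauss(0, 1) for _ in range(4)] for label in y]

top_full = rank_features(X, y, k=2)
X_s, y_s = undersample(X, y, minority_share=0.5)
top_sampled = rank_features(X_s, y_s, k=2)
print("stability (Jaccard):", jaccard(top_full, top_sampled))
```

Comparing the feature subset chosen on the original data with the one chosen on the sampled data is only one possible reading of "stability"; repeating the sampling with different seeds and averaging the overlap would be a natural extension.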


2014 ◽  
Vol 556-562 ◽  
pp. 2783-2786
Author(s):  
Qing Hai Meng

GPS measurement signals in aircraft experiments are often affected by the transmission environment and corrupted by impulsive noise. To address this, a method that combines singular value decomposition (SVD) with a wavelet neural network to detect and eliminate impulsive noise is proposed. The received GPS data is decomposed by SVD, and the decomposed components serve as the input to the wavelet neural network. The Letts criterion is applied to the output residual error of the wavelet neural network to detect the impulsive noise. At each detected interference point, the wavelet network output replaces the measured value, thereby eliminating the impulsive noise.
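The detect-and-replace step can be sketched in Python under strong simplifying assumptions. This is not the proposed method: a neighbour-median predictor stands in for the SVD plus wavelet neural network stage, and because the abstract does not define the Letts criterion, a 3-sigma-style threshold on the residuals is assumed in its place. The function names and the test signal are hypothetical.

```python
from statistics import mean, median, pstdev

def predict(signal, half_window=2):
    """Stand-in predictor (assumption): the median of each sample's
    neighbours, excluding the sample itself."""
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half_window)
        hi = min(len(signal), i + half_window + 1)
        neighbours = signal[lo:i] + signal[i + 1:hi]
        out.append(median(neighbours))
    return out

def remove_impulses(signal, n_sigma=3.0):
    """Flag samples whose prediction residual deviates from the mean
    residual by more than n_sigma standard deviations, then replace
    each flagged sample with the predictor output."""
    predicted = predict(signal)
    residuals = [s - p for s, p in zip(signal, predicted)]
    mu, sigma = mean(residuals), pstdev(residuals)
    flagged = [i for i, r in enumerate(residuals)
               if abs(r - mu) > n_sigma * sigma]
    cleaned = list(signal)
    for i in flagged:
        cleaned[i] = predicted[i]
    return cleaned, flagged

# Hypothetical measurement: a slow ramp with one impulse at index 20.
signal = [0.1 * i for i in range(50)]
signal[20] += 100.0

cleaned, flagged = remove_impulses(signal)
print("flagged indices:", flagged)
```

A plain standard deviation is itself inflated by large impulses, so a robust spread estimate may be preferable when many samples are corrupted.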


2008 ◽  
Vol 16 (4) ◽  
pp. 563-600 ◽  
Author(s):  
Taghi M. Khoshgoftaar ◽  
Jason Van Hulse
