The Application of Machine Learning in Test Case Prioritization - A Review

Author(s):  
Elinda Kajo Mece ◽  
Kleona Binjaku ◽  
Hakik Paci

Regression testing is an important but costly and time-consuming activity that assures developers that changes to an application do not introduce new errors. Retest-all, test case selection, and test case prioritization (TCP) approaches are used to improve the efficiency and effectiveness of regression testing. While test case selection techniques reduce testing time and cost, they can exclude critical test cases capable of detecting faults. Test case prioritization, in contrast, keeps all test cases and executes them, most important first, until resources are exhausted or the whole suite has run. Over the years, machine learning has found wide use in software engineering: many software development and maintenance problems can be formulated as learning problems, and machine learning techniques have proven effective at solving them, including the test case prioritization problem. In this paper, we investigate the application of machine learning techniques to test case prioritization. We survey some of the most recent studies in this field and report the machine learning techniques used in the TCP process, the metrics used to measure the effectiveness of the proposed methods, the data used to define test case priorities, and some advantages and limitations of applying machine learning to TCP.
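
A minimal sketch of one common ML-based TCP setup covered by such surveys, assuming historical execution records are available; the feature names, data, and the scikit-learn model choice are illustrative, not taken from any of the reviewed studies:

```python
# Learn a failure-probability model from historical test executions and order
# the suite by predicted probability (higher first). All values are made up.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical history: one row per (test case, past run) with simple features
# [recent_failure_rate, changed_lines_covered, last_exec_time_s]
X_history = np.array([
    [0.60, 120, 3.2],
    [0.05,  10, 0.8],
    [0.30,  75, 2.1],
    [0.00,   5, 0.5],
])
y_history = np.array([1, 0, 1, 0])  # 1 = that run revealed a fault

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_history, y_history)

# Current suite: the same features computed for the pending regression run
suite = {"T1": [0.50, 90, 2.0], "T2": [0.02, 8, 0.4], "T3": [0.25, 60, 1.5]}
scores = {t: model.predict_proba([f])[0][1] for t, f in suite.items()}

# Tests with the highest predicted failure probability run first
priority = sorted(suite, key=scores.get, reverse=True)
print(priority)
```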

2020 ◽  
Author(s):  
Andreea Vescan ◽  
Camelia-M Pintea ◽  
Petrică C Pop

Abstract Regression testing is applied whenever code changes, to ensure that the modifications fixed the fault and that no other faults were introduced. Because of the large number of test cases to be run, test case prioritization is one of the strategies that allows running the test cases with the highest fault-detection rate first. The aim of the paper is to present an optimized test case prioritization method inspired by ant colony optimization, test case prioritization-ANT. The criteria used by the optimization algorithm are the number of faults not yet covered by the selected test cases and the sum of the severities of those faults. The cost, i.e. execution time, of the test cases is considered in the computation of the pheromone deposited on the graph's edges. The average percentage of faults detected (APFD) metric, as the selection criterion, is used to uncover the maximum number of faults with the highest severity while reducing regression testing time. Several experiments are considered, detailed and discussed, comparing various alternatives for the algorithm's parameters. A benchmark project is also used to validate the proposed approach. The obtained results are encouraging, being a cornerstone for new perspectives to be considered.
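
For reference, the APFD metric named above can be computed as sketched below; the fault matrix is illustrative, not the paper's benchmark data:

```python
# APFD = 1 - (sum of first-detection positions)/(n*m) + 1/(2n),
# where n = number of tests, m = number of faults.
def apfd(order, fault_matrix):
    """order: prioritized list of test ids;
    fault_matrix: dict test id -> set of faults that the test detects."""
    faults = set().union(*fault_matrix.values())
    n, m = len(order), len(faults)
    first_pos = {}                      # fault -> position of first detecting test
    for pos, test in enumerate(order, start=1):
        for f in fault_matrix[test]:
            first_pos.setdefault(f, pos)
    return 1 - sum(first_pos.values()) / (n * m) + 1 / (2 * n)

faults_of = {"T1": {"f1", "f3"}, "T2": {"f2"}, "T3": {"f1"}}
print(apfd(["T1", "T2", "T3"], faults_of))  # ~0.72
print(apfd(["T3", "T2", "T1"], faults_of))  # 0.50: f3 is only found last
```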


2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Ali M. Alakeel

Program assertions have been recognized as a supporting tool during software development, testing, and maintenance. Therefore, software developers place assertions within their code in positions that are considered error prone or that have the potential to lead to a software crash or failure. Like any other software, programs with assertions must be maintained. Depending on the type of modification applied to the program, assertions may also have to be modified, new assertions may be introduced in the new version of the program, and some assertions may be kept unchanged. This paper presents a novel approach for test case prioritization during regression testing of programs that have assertions, using fuzzy logic. The main objective of this approach is to prioritize the test cases according to their estimated potential for violating a given program assertion. To develop the proposed approach, we utilize fuzzy logic techniques to estimate the effectiveness of a given test case in violating an assertion, based on the history of the test cases in previous testing operations. We have conducted a case study in which the proposed approach is applied to various programs, and the results are promising compared to untreated and randomly ordered test cases.
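
A minimal sketch of the general idea, assuming per-test historical assertion-violation rates are available; the membership functions and rule base are illustrative and not the paper's fuzzy model:

```python
# Score each test case with a tiny Mamdani-style fuzzy inference over its
# historical assertion-violation rate, then sort by the defuzzified score.
def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def priority_score(violation_rate):
    # Fuzzify: how "low" / "medium" / "high" the historical rate is
    low  = tri(violation_rate, -0.5, 0.0, 0.5)
    med  = tri(violation_rate,  0.0, 0.5, 1.0)
    high = tri(violation_rate,  0.5, 1.0, 1.5)
    # Rules: low rate -> priority 0.2, medium -> 0.5, high -> 0.9,
    # combined as a weighted average of singleton outputs (defuzzification).
    num = 0.2 * low + 0.5 * med + 0.9 * high
    den = low + med + high
    return num / den if den else 0.0

history = {"T1": 0.8, "T2": 0.1, "T3": 0.45}   # illustrative violation rates
order = sorted(history, key=lambda t: priority_score(history[t]), reverse=True)
print(order)  # T1 first: highest estimated chance of violating the assertion
```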


Test case prioritization (TCP) is a software testing technique that finds an ideal ordering of test cases for regression testing, so that testers can obtain the maximum benefit from their test suite even if the testing process is stopped at some arbitrary point. The recent trend in software development uses the object-oriented (OO) paradigm. This paper proposes a cost-cognizant TCP approach for object-oriented software that uses path-based integration testing. Path-based integration testing identifies the possible execution paths and extracts them from the Java System Dependence Graph (JSDG) model of the source code using a forward slicing technique. Afterwards, an evolutionary algorithm (EA) is employed to prioritize test cases based on the severity detected per unit cost for each of the dependent faults. The proposed technique, known as Evolutionary Cost-Cognizant Regression Test Case Prioritization (ECRTP), is implemented as a regression testing approach for the experiment.
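
A hedged sketch of the cost-cognizant objective, assuming per-test execution costs and fault severities are known; the simple mutation-only evolutionary loop and the data are illustrative stand-ins, not the ECRTP algorithm itself:

```python
# Evolve a permutation of test cases that rewards detecting severe faults
# early relative to the execution cost spent so far.
import random

tests = {  # test id -> (execution cost, {fault: severity} it detects)
    "T1": (4.0, {"f1": 5}),
    "T2": (1.0, {"f2": 2}),
    "T3": (2.0, {"f1": 5, "f3": 8}),
}

def fitness(order):
    """Sum of newly detected severity per unit of cumulative cost."""
    seen, spent, total = set(), 0.0, 0.0
    for t in order:
        cost, faults = tests[t]
        spent += cost
        new = {f: s for f, s in faults.items() if f not in seen}
        seen.update(new)
        total += sum(new.values()) / spent
    return total

def mutate(order):
    i, j = random.sample(range(len(order)), 2)
    child = order[:]
    child[i], child[j] = child[j], child[i]
    return child

random.seed(0)
best = list(tests)
for _ in range(200):                     # mutation-only evolutionary loop
    child = mutate(best)
    if fitness(child) > fitness(best):
        best = child
print(best, fitness(best))
```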


Regression testing is performed to confirm that changes to a software program do not disturb its existing characteristics. As the software evolves, the test suite tends to grow in size, which makes it costly to execute, and thus the test cases need to be prioritized so that the most effective ones are selected for software testing. In this paper, a test case prioritization technique for regression testing is proposed using a novel optimization algorithm known as the Taylor series-based Jaya Optimization Algorithm (Taylor-JOA), which integrates the Taylor series into the Jaya Optimization Algorithm (JOA). The optimal test cases are selected based on a fitness function modelled on two constraints, namely fault detection and branch coverage. The proposed Taylor-JOA is evaluated using the Average Percentage of Faults Detected (APFD) and the Average Percentage of Branch Coverage (APBC) metrics. The APFD and APBC of the proposed Taylor-JOA are 0.995 and 0.9917, respectively, which are higher than those of the existing methods and show the effectiveness of the proposed Taylor-JOA for test case prioritization.
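
A hedged sketch of a fitness function combining fault detection and branch coverage, as the abstract describes; the weighted-sum form, the weights, and the coverage data are assumptions, and the optimizer itself (Taylor-JOA in the paper) is not reproduced here:

```python
# APFD and APBC are both instances of the same "average percentage covered"
# formula; a crisp fitness can weight the two.
def avg_pct_covered(order, covers):
    """Generic APxC: covers maps test id -> set of items (faults or branches)."""
    items = set().union(*covers.values())
    n, m = len(order), len(items)
    first = {}
    for pos, t in enumerate(order, start=1):
        for item in covers[t]:
            first.setdefault(item, pos)
    return 1 - sum(first.values()) / (n * m) + 1 / (2 * n)

faults   = {"T1": {"f1"}, "T2": {"f2"}, "T3": {"f1", "f3"}}
branches = {"T1": {"b1", "b2"}, "T2": {"b3"}, "T3": {"b2", "b4"}}

def fitness(order, w_fault=0.6, w_branch=0.4):
    return w_fault * avg_pct_covered(order, faults) + \
           w_branch * avg_pct_covered(order, branches)

print(fitness(["T3", "T1", "T2"]))   # candidate orderings scored by the
print(fitness(["T2", "T1", "T3"]))   # optimizer (Taylor-JOA in the paper)
```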


Author(s):  
P. Priakanth ◽  
S. Gopikrishnan

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from the available data. Since IoT will be among the major sources of new data, data science will make a great contribution to making IoT applications more intelligent. Machine learning can be applied in cases where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning is the result of interaction between a model and the environment (reinforcement learning). This chapter answers the following questions: How can machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And what are the characteristics of real-world IoT data that require data analytics?


2013 ◽  
Vol 10 (1) ◽  
pp. 73-102 ◽  
Author(s):  
Lijun Mei ◽  
Yan Cai ◽  
Changjiang Jia ◽  
Bo Jiang ◽  
W.K. Chan

Many web services not only communicate through XML-based messages, but may also dynamically modify their behaviors by applying different interpretations to XML messages through updates to the associated XML Schemas or XML-based interface specifications. Such artifacts are usually complex, making the XML-based messages that conform to these specifications structurally complex as well. Testing should cover all such scenarios cost-effectively. Test case prioritization is a dimension of regression testing that guards a program against unintended modifications by reordering the test cases within a test suite. However, many existing test case prioritization techniques for regression testing treat test cases of different complexity generically. In this paper, the authors exploit insights on the structural similarity of XML-based artifacts between test cases in both static and dynamic dimensions, and propose a family of test case prioritization techniques that select pairs of test cases without replacement in turn. To the best of their knowledge, it is the first test case prioritization proposal that selects test case pairs for prioritization. The authors validate their techniques on a suite of benchmarks. The empirical results show that, when incorporating all dimensions, some members of their technique family can be more effective than conventional coverage-based techniques.
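
A hedged sketch of the pairwise selection idea, assuming each test case is summarized by the set of XML element tags its messages use; the Jaccard distance stands in for the paper's combined static and dynamic similarity dimensions:

```python
# Repeatedly pick the remaining pair of test cases whose XML message
# structures are most dissimilar, removing both from the pool each time.
from itertools import combinations

tag_sets = {  # illustrative: element tags appearing in each test's XML messages
    "T1": {"order", "item", "price"},
    "T2": {"order", "item", "qty"},
    "T3": {"invoice", "total"},
    "T4": {"invoice", "item", "price"},
}

def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)

remaining = set(tag_sets)
order = []
while len(remaining) > 1:
    # most structurally dissimilar pair among the not-yet-selected tests
    pair = max(combinations(remaining, 2),
               key=lambda p: jaccard_distance(tag_sets[p[0]], tag_sets[p[1]]))
    order.extend(pair)
    remaining -= set(pair)
order.extend(remaining)          # odd leftover, if any
print(order)
```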


Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 1857
Author(s):  
A. D. Shrivathsan ◽  
R. Krishankumar ◽  
Arunodaya Raj Mishra ◽  
K. S. Ravichandran ◽  
Samarjit Kar ◽  
...  

This paper focuses on an exciting and essential problem in software companies. The software life cycle includes testing software, which is often time-consuming and is a critical phase in the software development process. To reduce the time spent on testing and to maintain software quality, a systematic selection of test cases is needed. Motivated by this, researchers have approached test case prioritization (TCP) by applying the concepts of multi-criteria decision-making (MCDM). However, the literature on TCP suffers from the following issues: (i) difficulty in properly handling uncertainty; (ii) systematic evaluation of criteria by understanding the hesitation of experts; and (iii) rational prioritization of test cases by considering the nature of the criteria. Motivated by these issues, this paper puts forward an integrated approach that circumvents them. The main aim of this research is to develop a decision model with integrated methods for TCP. The core contributions of the proposed model are to (i) provide a systematic/methodical decision on TCP with a reduction in testing time and cost; (ii) help software personnel choose an apt test case from the suite for testing software; and (iii) reduce human bias by mitigating the intervention of personnel in the decision process. To this end, probabilistic linguistic information (PLI) is adopted as the preference structure, which can flexibly handle uncertainty by associating an occurrence probability with each linguistic term. Furthermore, an attitude-based entropy measure is presented for criteria weight calculation, and finally, the EDAS ranking method is extended to PLI for TCP. An empirical study of TCP in a software company is presented to demonstrate the integrated approach's effectiveness. The strengths and weaknesses of the introduced approach are discussed by comparing it with relevant methods.
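
A hedged sketch of a crisp-valued EDAS ranking over a small test-case decision matrix; the paper's actual method works on probabilistic linguistic information with attitude-based entropy weights, so the numbers, criteria, and weights below are purely illustrative:

```python
# EDAS: rank alternatives by their distance from the average solution.
import numpy as np

# rows: test cases T1..T4; columns: criteria (e.g. fault proneness, coverage, cost)
X = np.array([
    [0.8, 0.6, 3.0],
    [0.4, 0.9, 1.0],
    [0.7, 0.7, 2.0],
    [0.3, 0.5, 0.5],
])
weights = np.array([0.5, 0.3, 0.2])       # assumed criteria weights
benefit = np.array([True, True, False])   # third criterion is a cost (exec time)

av = X.mean(axis=0)                                    # average solution
pda = np.where(benefit, np.maximum(0, X - av), np.maximum(0, av - X)) / av
nda = np.where(benefit, np.maximum(0, av - X), np.maximum(0, X - av)) / av

sp = (pda * weights).sum(axis=1)           # weighted positive distances
sn = (nda * weights).sum(axis=1)           # weighted negative distances
nsp = sp / sp.max()
nsn = 1 - sn / sn.max()
appraisal = (nsp + nsn) / 2                # higher = better test case to run first

ranking = np.argsort(-appraisal)
print(["T%d" % (i + 1) for i in ranking], appraisal.round(3))
```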


2016 ◽  
Vol 2016 ◽  
pp. 1-19 ◽  
Author(s):  
Rongcun Wang ◽  
Shujuan Jiang ◽  
Deng Chen ◽  
Yanmei Zhang

Similarity-based test case prioritization algorithms have been applied to regression testing. The common characteristic of these algorithms is to reschedule the execution order of test cases according to the distances between pairs of test cases. The distance information can be calculated with different similarity measures. Since the topologies vary with the similarity measure, the distances between pairs of test cases calculated by different measures differ, and the choice of similarity measure can significantly influence the effectiveness of test case prioritization. Therefore, we empirically evaluate the effects of six similarity measures on two similarity-based test case prioritization algorithms. The obtained results are statistically analyzed to recommend the best combination of similarity-based prioritization algorithm and similarity measure. The experimental results, confirmed by a statistical analysis, indicate that Euclidean distance is more efficient in finding defects than the other similarity measures. The combination of the global similarity-based prioritization algorithm and Euclidean distance could be a better choice: it yields not only higher fault detection effectiveness but also a smaller standard deviation. The goal of this study is to provide practical guidance for picking the appropriate combination of similarity-based test case prioritization technique and similarity measure.
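
A hedged sketch of a global similarity-based ordering with Euclidean distance, assuming binary coverage vectors per test; the greedy farthest-first scheme is one plausible instantiation, not necessarily the exact algorithm evaluated in the paper:

```python
# Start from the test farthest from the suite's centroid, then repeatedly pick
# the test whose summed Euclidean distance to all already-selected tests is largest.
import numpy as np

coverage = {  # test id -> binary statement/branch coverage vector (illustrative)
    "T1": np.array([1, 1, 0, 0, 1]),
    "T2": np.array([1, 1, 0, 0, 0]),
    "T3": np.array([0, 0, 1, 1, 1]),
    "T4": np.array([1, 0, 1, 0, 0]),
}

def euclidean(a, b):
    return float(np.linalg.norm(a - b))

centroid = np.mean(list(coverage.values()), axis=0)
remaining = dict(coverage)
first = max(remaining, key=lambda t: euclidean(remaining[t], centroid))
order = [first]
del remaining[first]

while remaining:
    nxt = max(remaining,
              key=lambda t: sum(euclidean(remaining[t], coverage[s]) for s in order))
    order.append(nxt)
    del remaining[nxt]

print(order)   # most mutually dissimilar tests run first
```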


Most data gathered these days is unstructured, and obtaining labelled data is becoming very hard because of the volume of data generated every second. Training a model directly on unstructured, unlabelled data is almost impossible. In the proposed approach, the unlabelled data is first divided into groups using machine learning techniques, and CNN/deep learning/machine learning models are then trained on the grouped data. The model is enhanced over time through user feedback and the addition of new data. Existing models can be trained on labelled data only; without labels they cannot be used for prediction or reinforcement learning. In this approach, even though the data is unlabelled, specifying a feature column allows the model to be trained with the help of a subject-matter expert (SME). This is useful in many areas of classification and in predicting trends and patterns. Supervised machine learning and deep learning techniques are used, with Python, PyTorch and TensorFlow as tools. The input can be any data (audio, video, image or text); the output is labelled data and a model file that can be used for further predictions and improved through feedback.
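
A minimal sketch of the described pipeline, assuming numeric feature vectors have already been extracted from the raw audio/video/image/text input; the k-means grouping, the classifier choice, and the data are illustrative assumptions:

```python
# Group unlabelled feature vectors, let an SME name/verify the clusters, then
# train a supervised model on the resulting pseudo-labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Unlabelled data: pretend these are extracted features of audio/image/text items
X_unlabelled = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 4)),
    rng.normal(loc=3.0, scale=0.5, size=(50, 4)),
])

# Step 1: group the data (number of groups chosen/confirmed by the SME)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_unlabelled)
pseudo_labels = kmeans.labels_          # SME reviews and names these groups

# Step 2: train a supervised model on the pseudo-labelled data
clf = LogisticRegression().fit(X_unlabelled, pseudo_labels)

# Step 3: predict groups for new data; feedback on mistakes becomes new
# labelled data, and steps 1-2 are repeated to improve the model over time
X_new = rng.normal(loc=3.0, scale=0.5, size=(5, 4))
print(clf.predict(X_new))
```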

