Comparing fine-grained source code changes and code churn for bug prediction

More than two decades ago, researchers started to mine the data stored in software repositories to help software developers make informed decisions when developing and testing software systems. Bug prediction was one of the most promising and popular research directions; it uses the data stored in software repositories to predict the bug-proneness of, or the number of bugs in, source files. On that topic, and as part of Emanuel's PhD studies, we submitted a paper titled Comparing fine-grained source code changes and code churn for bug prediction [8] to the 8th Working Conference on Mining Software Repositories, held in 2011 in beautiful Honolulu, Hawaii. Ten years later, it was selected as one of the finalists for the MSR 2021 Most Influential Paper Award. In the following, we provide a retrospective on our work, describing the road to publishing this paper, its impact on the field of bug prediction, and the road ahead.

Augmenting commit classification by using fine-grained source code changes and a pre-trained deep neural language model

Information and Software Technology ◽

10.1016/j.infsof.2021.106566 ◽

2021 ◽

Vol 135 ◽

pp. 106566

Author(s):

Lobna Ghadhab ◽

Ilyes Jenhani ◽

Mohamed Wiem Mkaouer ◽

Montassar Ben Messaoud

Keyword(s):

Language Model ◽

Source Code ◽

Fine Grained ◽

Code Changes ◽

Source Code Changes

Mining source code changes from software repositories

2011 7th Central and Eastern European Software Engineering Conference (CEE-SECR) ◽

10.1109/cee-secr.2011.6188468 ◽

2011 ◽

Cited By ~ 3

Author(s):

Crt Gerlec ◽

Andrej Krajnc ◽

Marjan Hericko ◽

Jan Boznik

Keyword(s):

Source Code ◽

Software Repositories ◽

Code Changes ◽

Source Code Changes

Analyzing the Impact of Antipatterns on Change-Proneness Using Fine-Grained Source Code Changes

2012 19th Working Conference on Reverse Engineering ◽

10.1109/wcre.2012.53 ◽

2012 ◽

Cited By ~ 24

Author(s):

Daniele Romano ◽

Paulius Raila ◽

Martin Pinzger ◽

Foutse Khomh

Keyword(s):

Source Code ◽

Fine Grained ◽

Change Proneness ◽

Code Changes ◽

The Impact ◽

Source Code Changes

Using Topic Model to Suggest Fine-Grained Source Code Changes

2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) ◽

10.1109/icsme.2016.40 ◽

2016 ◽

Cited By ~ 1

Author(s):

Hoan Anh Nguyen ◽

Anh Tuan Nguyen ◽

Tien N. Nguyen

Keyword(s):

Topic Model ◽

Source Code ◽

Fine Grained ◽

Code Changes ◽

Source Code Changes

Comparing fine-grained source code changes and code churn for bug prediction

Proceeding of the 8th working conference on Mining software repositories - MSR '11 ◽

10.1145/1985441.1985456 ◽

2011 ◽

Cited By ~ 47

Author(s):

Emanuel Giger ◽

Martin Pinzger ◽

Harald C. Gall

Keyword(s):

Source Code ◽

Fine Grained ◽

Code Changes ◽

Source Code Changes

Mapping Software Design Changes to Source Code Changes

Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007) ◽

10.1109/snpd.2007.293 ◽

2007 ◽

Author(s):

Xiangchen Tan ◽

Tie Feng ◽

Jiachen Zhang

Keyword(s):

Software Design ◽

Source Code ◽

Mapping Software ◽

Design Changes ◽

Code Changes ◽

Source Code Changes

A Novel Effort Measure Method for Effort-Aware Just-in-Time Software Defect Prediction

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194021500364 ◽

2021 ◽

Vol 31 (08) ◽

pp. 1145-1169

Author(s):

Liqiong Chen ◽

Shilong Song ◽

Can Wang

Keyword(s):

Prediction Model ◽

Defect Prediction ◽

Software Systems ◽

Support Vector ◽

Just In Time ◽

Software Defect Prediction ◽

Fine Grained ◽

Software Defect ◽

Code Changes ◽

The Cost

Just-in-time software defect prediction (JIT-SDP) is a fine-grained software defect prediction technique that aims to identify defective code changes in software systems. Effort-aware software defect prediction additionally takes the cost of code inspection into account, so that more defective code changes can be found with limited testing resources. Traditional effort-aware defect prediction models mainly measure effort by the number of lines of code (LOC) and rarely consider additional factors. This paper proposes a novel effort measure method called Multi-Metric Joint Calculation (MMJC). When measuring effort, MMJC takes into account not only LOC but also the distribution of modified code across different files (Entropy), the number of developers that changed the files (NDEV), and the developer experience (EXP). In the simulation experiment, MMJC is combined with Linear Regression, Decision Tree, Random Forest, LightGBM, Support Vector Machine, and Neural Network, respectively, to build software defect prediction models. Several comparative experiments are conducted between the models based on MMJC and baseline models. The results show that, compared with the baseline models, the indicators ACC and [Formula: see text] of the models based on MMJC improve by 35.3% and 15.9% on average, respectively, across the three verification scenarios.
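
As a rough illustration of the Entropy factor mentioned in the abstract, the sketch below computes the Shannon entropy of how a change's modified lines are spread across files. It assumes the base-2 definition commonly used for change-level metrics; the abstract does not give the formula or how MMJC combines the factors, so the function name `change_entropy` and its input format are hypothetical and this is not the authors' implementation.

```python
import math


def change_entropy(lines_changed_per_file):
    """Shannon entropy (base 2) of a change's modified lines across files.

    Assumption: this is the textbook entropy definition, not a formula
    taken from the MMJC paper. `lines_changed_per_file` is a hypothetical
    input: one non-negative count of modified lines per touched file.
    """
    total = sum(lines_changed_per_file)
    if total == 0:
        # No modified lines at all: treat the change as having zero spread.
        return 0.0
    entropy = 0.0
    for count in lines_changed_per_file:
        if count > 0:
            share = count / total  # fraction of the change in this file
            entropy -= share * math.log2(share)
    return entropy


# A change confined to one file has entropy 0; a change split evenly
# over two files has entropy 1 bit.
single_file = change_entropy([20])
even_split = change_entropy([10, 10])
```

Under this definition, a higher value means the modification is scattered over more files, which is the intuition behind treating it as an extra inspection cost alongside LOC.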
