Predicting Build Outcomes In Continuous Integration Using Textual Analysis of Source Code Commits
Machine learning is increasingly used to solve software engineering tasks. One example is predicting the outcome of builds in continuous integration, where a classifier is trained to predict whether new code commits will compile successfully. The aim of this study is to investigate the effectiveness of fifteen software metrics in building a classifier for build outcome prediction. In particular, we ran an experiment comparing the effectiveness of one line-level metric (token frequency) and fourteen traditional software metrics on 49,040 build records belonging to 117 Java projects. The results show that training on the line-level metric yields slightly better predictive quality in identifying passing builds than training on file-level metrics. Specifically, we achieved an average precision of 91% and a recall of 80% when using the line-level metric for training, compared to 90% precision and 76% recall for the next best traditional software metric. The effect size analysis showed that the difference between token frequency and the next best traditional software metric is small (effect size = 0.23), indicating that the probability of achieving higher predictive quality when using the token frequency metric is low. We conclude that token frequency can be used for build prediction to pinpoint issues in lines of code without compromising the predictive quality in identifying passing builds.
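As a rough illustration of the two ideas the abstract relies on, the sketch below counts token frequencies for a single line of code and computes precision and recall for the passing class. It is not the authors' implementation: the tokenizer regex, the function names `token_frequencies` and `precision_recall`, and the `"pass"`/`"fail"` labels are all assumptions made for this example, and no classifier is trained here.

```python
import re
from collections import Counter

# Assumed tokenizer: identifiers/keywords, integer literals, and single
# punctuation characters. The tokenization used in the study may differ.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\w\s]")


def token_frequencies(line: str) -> Counter:
    """Count how often each token occurs in one line of source code."""
    return Counter(TOKEN_RE.findall(line))


def precision_recall(actual, predicted, positive="pass"):
    """Precision and recall for the `positive` class (here: passing builds)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical changed line from a Java commit.
    print(token_frequencies("int x = x + 1;"))

    # Hypothetical build outcomes and predictions.
    actual = ["pass", "pass", "fail", "pass"]
    predicted = ["pass", "fail", "pass", "pass"]
    print(precision_recall(actual, predicted))
```

In a full pipeline, the per-line token counts would be turned into feature vectors and passed to a classifier. Precision and recall would then be computed on held-out build records, as in the second function.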
Fri 18 Nov. Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi
14:00 - 15:30
14:00 (20m) Research paper | Improving the Performance of Code Vulnerability Prediction using Abstract Syntax Tree Information | PROMISE | Fahad Al Debeyan (Lancaster University), Tracy Hall (Lancaster University), David Bowes (Lancaster University)
14:20 (20m) Research paper | Feature sets in just-in-time defect prediction: An empirical evaluation | PROMISE
14:40 (20m) Research paper | Predicting Build Outcomes In Continuous Integration Using Textual Analysis of Source Code Commits | PROMISE | Khaled Al-Sabbagh (University of Gothenburg), Miroslaw Staron (University of Gothenburg), Regina Hebig (University of Gothenburg)
15:00 (20m) Research paper | Identifying security-related requirements in regulatory documents based on cross-project classification | PROMISE | Mazen Mohamad (Chalmers and University of Gothenburg), Jan-Philipp Steghöfer (XITASO GmbH IT & Software Solutions), Alexander Åström (Volvo GTT), Riccardo Scandariato (Hamburg University of Technology)