Just-in-time defect prediction for software hunks

  • Xiaoyan Zhu
  • Chenyu Yan
  • E. James Whitehead
  • Binbin Niu
  • Lei Zhu
  • Long Pan

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Just-in-time defect prediction can remind software developers and managers to verify and fix bugs at the moment they appear, thus improving the effectiveness and validity of bug fixing. Existing studies mainly focus on just-in-time prediction for software files (JIT-F). JIT-F is a binary classification problem, which classifies (hence predicts) a file change as buggy or clean. This article provides a detailed analysis of just-in-time defect prediction for software hunks (JIT-H), which predicts bugs at a finer level of granularity, and hence further improves the efficiency of bug fixing. Classification is performed using the ensemble technique of bagging: aggregated combinations of random undersampling plus multiple classifiers (J48 and Random Forest). An empirical study with 10 open source projects was conducted to validate the effectiveness of JIT-H. Experimental results show that JIT-H is effective at predicting defects in software hunk changes. Compared with JIT-F, JIT-H is more cost effective. Additionally, analysis of the change features indicates that Text Vector features and hunk change level features are more important than features in other groups and levels.
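The abstract's classification scheme (bagging over randomly undersampled training sets) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the base learner here is a one-feature threshold stump standing in for J48 and Random Forest, the function names are hypothetical, and the toy data (a single "lines changed" feature per hunk, label 1 for buggy and 0 for clean) is invented purely for illustration.

```python
import random
from collections import Counter

def undersample(rows, labels, rng):
    """Randomly drop rows so every class is as small as the rarest class."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n = min(len(members) for members in by_class.values())
    return [(row, label)
            for label, members in by_class.items()
            for row in rng.sample(members, n)]

def train_stump(sample):
    """Stand-in base learner: the threshold rule with the fewest training errors."""
    best = None
    for f in range(len(sample[0][0])):
        for row, _ in sample:
            t = row[f]
            for above in (0, 1):  # label predicted when the value exceeds t
                errors = sum((above if r[f] > t else 1 - above) != y
                             for r, y in sample)
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return lambda row: above if row[f] > t else 1 - above

def fit_bagged(rows, labels, n_bags=11, seed=0):
    """Train one base learner per independently undersampled bag."""
    rng = random.Random(seed)
    return [train_stump(undersample(rows, labels, rng)) for _ in range(n_bags)]

def predict(models, row):
    """Aggregate the bag-level predictions by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]

# Toy, imbalanced data (invented): 20 clean hunks, 4 buggy hunks.
rows = [[n] for n in range(1, 21)] + [[50], [60], [70], [80]]
labels = [0] * 20 + [1] * 4

models = fit_bagged(rows, labels)
print(predict(models, [100]), predict(models, [2]))
```

Each bag sees a balanced sample, so no single learner is dominated by the clean majority class, while voting across bags recovers information that any one undersampled set discards.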

Original language: English
Pages (from-to): 130-153
Number of pages: 24
Journal: Software - Practice and Experience
Volume: 52
Issue number: 1
DOIs
State: Published - Jan 2022

Keywords

  • cost effectiveness
  • hunk change
  • imbalanced learning
  • JIT defect prediction
