Markus Borg, Petter Gulin, and Linus Olofsson
Priority of reported issues is a poor predictor of future code change.
Suspicions empirically confirmed: submitter metadata most important.
Issue management in market-driven software projects is constantly under time pressure. A limited set of developers must share their time between developing features for the next release and resolving reported issues. Project managers need to find the appropriate balance between a high-quality product and fast time to market.
We study a telecom company in Sweden developing embedded systems for a consumer market. The project managers report that developers resolve approximately 10% of the issues reported during a project. Consequently, it is critical to properly prioritize the issues to receive the best possible return on investment, and above all to remove all bugs that might impact the market’s reception of the product.
We use machine learning to investigate what features of an issue report are the best predictors of changes to production code during its corresponding resolution. After removing all features jeopardizing the confidentiality of individual engineers, the issue reports are characterized by 19 features (apart from text).
We extract 80,000 issue reports, an equal mix of positive and negative examples, and train a Bayesian Network classifier, obtaining 73% classification accuracy. The classifier also reveals that the feature with the highest predictive value is the physical site from which the issue was submitted. The general priority feature, however, is ranked only 17th out of 19, whereas the submitting team is ranked 12th. Our findings confirm a suspicion in the company: the priority set by the issue submitter is indeed a poor predictor of a future code change.
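To illustrate what ranking features by predictive value can look like, the following minimal sketch orders categorical features by information gain with respect to a binary "code changed" label. It is an assumption made for illustration: the abstract does not state which ranking criterion was used, and the feature names (`site`, `priority`) and the four toy rows are hypothetical, not data from the study.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by knowing the value of `feature`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = {f: information_gain(rows, labels, f) for f in rows[0]}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: each row is an issue report, each label says
# whether resolving the issue changed production code.
rows = [
    {"site": "A", "priority": "high"},
    {"site": "A", "priority": "low"},
    {"site": "B", "priority": "high"},
    {"site": "B", "priority": "low"},
]
labels = [True, True, False, False]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy set the submitting site separates the two classes perfectly while the priority carries no information, which mirrors the shape of the finding reported above, though not its actual numbers.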