Component Selection in Software Engineering – Which Attributes Are the Most Important in the Decision Process?

Forced to pick, what do you choose?

Time to extend your system with a new component! Should you develop it yourself or outsource the development? Or integrate an open source component? Or purchase from another software vendor? Based on a $100 test with 157 industry practitioners, we show that cost is the most important factor in the decision.

This paper is a spin-off from one of the foundational studies of the Orion project – a large research endeavor on decision-making in component-based software engineering. One of the studies we conducted within the Orion project to understand state-of-practice decision-making in industry was a large survey. The survey was long and complex, and to keep the resulting survey paper focused, we decided to extract one of the questions into a separate publication: if we force respondents to balance various factors against each other when selecting components, which factor is the most important? An important reason for pursuing this statistics-heavy paper was that the first author, Panagiota, did a PhD related to the analysis of $100 tests… More about that further down.

Our work was nominated for the Best Paper Award at the Euromicro Conference on Software Engineering and Advanced Applications (SEAA) 2018!

Component sourcing options

In this study, in line with a large part of the Orion research, we specifically targeted Component Sourcing Options (CSO). Suppose you are about to add or replace a component in an evolving system. Should you 1) develop it yourself, 2) outsource the development, 3) integrate an open source component, or 4) purchase something commercial off-the-shelf (COTS)? While a decision between the four alternatives is a simplification of reality, this is the primary research target in the Orion project – we have studied it from several perspectives.

In this paper, we explore how important various component attributes are in the decision process. As part of a larger web-based questionnaire, we wanted the respondents to prioritize between the 11 attributes below, or provide their own attribute. First, we asked the respondents to select up to five of the attributes in the list. Then we asked the respondents to weigh the selected subset of attributes against each other – for this we used the $100 test. The findings in this paper are based on 157 responses from industry practitioners.

Attributes to consider when selecting components.

The hundred-dollar test

The $100 test is an example of cumulative voting, a straightforward question design that asks respondents to assign money to represent relative preferences. Each stakeholder gets $100 and decides freely how to distribute it among the alternatives, e.g., by putting everything on a single alternative or by spreading it equally across all of them. The table below shows the descriptive statistics of the responses. It’s already clear that the respondents consider cost to be the main driver during component selection.

Descriptive statistics from the $100 question, including how many respondents selected the attribute and the total $ assigned.
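
As a minimal sketch of how such a table can be tallied, the Python snippet below checks each response against the rules described above (at most five attributes, exactly $100 in total) and then counts selections and dollars per attribute. The function name, the shortened attribute labels, and the three toy responses are illustrative assumptions, not data from the survey.

```python
from collections import Counter


def summarize_hundred_dollar(responses, budget=100, max_selected=5):
    """Return {attribute: (times_selected, total_dollars)} for a $100 test.

    Each response is a dict mapping an attribute name to the dollars
    one respondent assigned to it. All names here are hypothetical.
    """
    selected = Counter()
    totals = Counter()
    for i, allocation in enumerate(responses):
        if any(dollars < 0 for dollars in allocation.values()):
            raise ValueError(f"respondent {i} assigned a negative amount")
        if sum(allocation.values()) != budget:
            raise ValueError(f"respondent {i} did not assign exactly ${budget}")
        chosen = [a for a, dollars in allocation.items() if dollars > 0]
        if len(chosen) > max_selected:
            raise ValueError(f"respondent {i} selected more than {max_selected} attributes")
        for attribute in chosen:
            selected[attribute] += 1
            totals[attribute] += allocation[attribute]
    return {a: (selected[a], totals[a]) for a in selected}


# Toy responses (made up for illustration only).
toy_responses = [
    {"Cost": 50, "Support": 30, "Longevity": 20},
    {"Cost": 100},
    {"Complexity": 40, "Cost": 40, "Code quality": 20},
]

summary = summarize_hundred_dollar(toy_responses)
for attribute, (count, dollars) in sorted(summary.items()):
    print(f"{attribute}: selected by {count}, ${dollars} in total")
```

Counting both the number of selections and the total dollars matters because an attribute can be picked by many respondents yet receive little money, or the other way around.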

Compositional data analysis and biplots

While you can see patterns in the descriptive statistics, we wanted to take the data analysis one step further. However, statistical analysis of $100 tests requires special treatment, as typical assumptions required by inferential statistics are violated. We used the statistical framework Compositional Data Analysis (CoDA) to analyze the data. Furthermore, we visualized the results using biplots, a graphical tool for exploring trends and supporting insights.
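
To give a feel for why special treatment is needed: the amounts in a $100 test always sum to 100, so raising one attribute necessarily lowers others. A common CoDA step is therefore to work with log-ratios, for instance through the centered log-ratio (clr) transform. The sketch below illustrates that idea under assumptions and is not the paper's exact pipeline: the zero-replacement value `delta`, the helper names, and the example allocation are all hypothetical.

```python
import math


def closure(parts):
    """Rescale non-negative parts so they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def replace_zeros(composition, delta=0.005):
    """Multiplicative zero replacement (delta is an assumed value).

    Zeros become delta; non-zero parts are shrunk so the total stays 1.
    Needed because the logarithm of zero is undefined.
    """
    n_zeros = sum(1 for p in composition if p == 0)
    shrink = 1 - n_zeros * delta
    return [delta if p == 0 else p * shrink for p in composition]


def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# One hypothetical respondent: $50, $30, $20, and $0 on four attributes.
allocation = [50, 30, 20, 0]
coords = clr(replace_zeros(closure(allocation)))
print([round(c, 3) for c in coords])
```

The clr coordinates always sum to zero, which reflects that only the relative sizes of the parts carry information.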

The main biplot in the paper is presented at the top of this page. The biplot shows individual responses as blue dots, and the red “rays” represent the different attributes. The length of a ray depicts variance; for example, the attribute “level of off-the-shelf fit” has high variance. The angles between rays approximate correlations between variables: 0° – positive, 90° – no correlation, 180° – negative correlation. We also transformed the data, using CoDA methods, to investigate the responses across some demographic characteristics: 1) the roles, 2) working experience, and 3) education of the respondents, 4) the maturity of the product, and 5) the size of the company. However, the analysis revealed no significant differences.
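
The angle rule for reading rays can be made concrete with a few lines of Python. This is a reading aid only, assuming each ray is given by hypothetical 2-D plot coordinates: the cosine of the angle between two rays serves as a rough stand-in for the correlation between the two attributes, and it is only as good as the two-dimensional projection behind the biplot.

```python
import math


def ray_angle_degrees(ray_a, ray_b):
    """Angle in degrees between two 2-D rays that start at the origin."""
    dot = ray_a[0] * ray_b[0] + ray_a[1] * ray_b[1]
    norm = math.hypot(*ray_a) * math.hypot(*ray_b)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return math.degrees(math.acos(cosine))


def approx_correlation(ray_a, ray_b):
    """Cosine of the angle: near 1 at 0°, near 0 at 90°, near -1 at 180°."""
    return math.cos(math.radians(ray_angle_degrees(ray_a, ray_b)))


# Hypothetical ray end points, not read off the paper's biplot.
ray_x = (1.0, 0.0)
ray_y = (0.0, 2.0)
ray_z = (-3.0, 0.0)

print(ray_angle_degrees(ray_x, ray_y), approx_correlation(ray_x, ray_y))
print(ray_angle_degrees(ray_x, ray_z), approx_correlation(ray_x, ray_z))
```

Note that the ray length does not enter the angle: a short and a long ray pointing the same way still suggest a positive correlation.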

Cost dwarfs other attributes

Since we found no statistically significant differences, we rely on descriptive statistics and the biplots to draw conclusions. We see that most attributes have long rays, which means there is considerable dissimilarity between the respondents' priorities – they assign the $100 rather differently. Still, we see in the descriptive statistics that cost is by far the most important individual attribute. Looking at the biplots, we see (among other things) that cost and access to documentation are negatively correlated, and that size and complexity are positively correlated.

Looking at the distribution of points in the biplot, we see that there are clusters, i.e., groups of practitioners who prioritize similarly. We found that respondents working with legal issues appear to prioritize rather differently – perhaps not so strange? We also report some other tendencies in the paper, e.g., experienced respondents are more concerned about code quality and support, and respondents working on newly established products care about complexity. Still, cost reduction is a common focus in CSO decisions among all groups of respondents – but the focus is even stronger for large companies.

Implications for Research

  • We show an example of statistical analysis of a $100 test using CoDA and biplots.
  • Small organizations with newly established products are more likely to prioritize other attributes than simply cost – decision support might be more useful to them.

Implications for Practice

  • Cost reduction is prioritized the highest when organizations choose between component sourcing options, i.e., whether to develop internally or not.
  • Be aware that the focus on cost is likely to increase even more as the company grows and matures.

Panagiota Chatzipetrou, Emil Alégroth, Efi Papatheocharous, Markus Borg, Tony Gorschek, and Krzysztof Wnuk. Component Selection in Software Engineering - Which Attributes Are the Most Important in the Decision Process? In Proc. of the 44th Euromicro Conference on Software Engineering and Advanced Applications, 2018.

Abstract

Component-based software engineering is a common approach to develop and evolve contemporary software systems where different component sourcing options are available: 1) Software developed internally (in-house), 2) Software developed outsourced, 3) Commercial off-the-shelf software, and 4) Open Source Software. However, there is little available research on what attributes of a component are the most important ones when selecting new components. The object of the present study is to investigate what matters the most to industry practitioners during component selection. We conducted a cross-domain anonymous survey with industry practitioners involved in component selection. First, the practitioners selected the most important attributes from a list. Next, they prioritized their selection using the Hundred-Dollar ($100) test. We analyzed the results using Compositional Data Analysis. The descriptive results showed that Cost was clearly considered the most important attribute during the component selection. Other important attributes for the practitioners were: Support of the component, Longevity prediction, and Level of off-the-shelf fit to product. Next, an exploratory analysis was conducted based on the practitioners' inherent characteristics. Nonparametric tests and biplots were used. It seems that smaller organizations and more immature products focus on different attributes than bigger organizations and mature products, which focus more on Cost.