How CodeScene Differs From Traditional Code Analysis Tools
The main difference between CodeScene’s behavioral code analysis and traditional code scanning techniques is that static analysis works on a snapshot of the codebase while CodeScene considers the temporal dimension and evolution of the whole system. This makes it possible for CodeScene to prioritize technical debt and code quality issues based on how the organization actually works with the code. Hence, we can limit the results to information that is relevant, actionable, and translates directly into business value.
CodeScene also goes beyond code as we consider the organization and people side of the system. This gives you valuable information that is invisible in the source code itself, such as measures of team autonomy and off-boarding risks.
This article explores how this is possible, and we also look at some independent academic research to find out how well it works in practice. With code quality, it’s your time and money on the line, so let’s invest wisely in our tooling.
A Behavioral Code Analysis Prioritizes Technical Debt
A traditional static code analysis tool focuses on a snapshot of the code as it looks right now. Such tooling is valuable in that it might find code that is overly complex, has heavy dependencies on other parts, or contains error-prone constructs. It’s genuinely useful and I use static code analysis myself – it’s a valuable practice.
However, a static analysis will never be able to tell you if that excess code complexity actually matters – just because a piece of code is complex doesn’t mean it’s a problem. This is where CodeScene’s behavioral code analysis fills an important gap.
CodeScene identifies and prioritizes technical debt based on how the organization works with the code. That is, we look at patterns in how the developers interact with the codebase, and we detect in which direction each piece of code evolves – does it get better or worse? We’re able to do that because we mine and analyze behavioral data as recorded in version-control systems and project management tools.
The analysis is completely automated, and CodeScene is able to prioritize a small part of your codebase – typically 2-4% – where code quality investments are most likely to pay off. As such, CodeScene differs in that its goal isn’t to give you detailed information on the whole codebase. Instead, a behavioral code analysis prioritizes any technical debt or code quality issues, and makes the resulting information actionable.
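To make the prioritization idea concrete, here is a minimal sketch in Python. It is not CodeScene’s actual algorithm. It assumes a simple heuristic where a file’s priority is its change frequency multiplied by a complexity proxy such as lines of code. The function names, the sample data, and the `top_fraction` parameter are all hypothetical.

```python
from collections import Counter


def change_frequencies(commits):
    """Count how many commits touched each file.

    `commits` is a list of file lists, one list per commit, of the kind
    you could extract from a version-control log.
    """
    counts = Counter()
    for files in commits:
        counts.update(set(files))  # count a file at most once per commit
    return counts


def rank_hotspots(commits, complexity, top_fraction=0.04):
    """Rank files by (change frequency x complexity proxy).

    Assumption: this product is only an illustrative heuristic for
    "complex code that the organization works with often".
    """
    freq = change_frequencies(commits)
    scored = [(path, freq[path] * complexity.get(path, 0)) for path in freq]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    keep = max(1, round(len(scored) * top_fraction))
    return scored[:keep]


# Hypothetical sample data: four commits and a lines-of-code proxy per file.
commits = [
    ["core/engine.py", "util/strings.py"],
    ["core/engine.py"],
    ["core/engine.py", "api/routes.py"],
    ["docs/readme.md"],
]
complexity = {
    "core/engine.py": 1200,
    "util/strings.py": 90,
    "api/routes.py": 300,
    "docs/readme.md": 40,
}

hotspots = rank_hotspots(commits, complexity, top_fraction=0.25)
print(hotspots)
```

In this toy data set, `core/engine.py` is both large and frequently changed, so it is the only file that makes the cut. A large file that nobody touches would score low, which is the point of adding the temporal dimension.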
So How Well Does It Work?
First of all, we have to remember that CodeScene was created to fill a gap; whereas static analysis tools are good at catching coding mistakes and providing detailed feedback to programmers, the same techniques don’t work particularly well for prioritizing technical debt. This is reported in a recent research paper from the University of Ottawa which concludes that:
in reality, acting upon all the TD instances is not worthy (Parthiban, D.G. Examination of tools for managing different dimensions of Technical Debt, 2019).
There’s simply too much technical debt, and the business value from fixing it isn’t clear. But the paper continues:
There are tools like CodeScene which helps in prioritizing the refactoring targets. It prioritizes TD instances based on their technical debt interest rate.
That is exactly our claim above.
So CodeScene works well in practice for prioritization. But what about the impact of its reported issues? Additional research, this time a code quality study from the University of Victoria, compared CodeScene to a market-leading static analysis tool and verified the reports by human inspection:
- Problems detected by the static analysis tool “were likely small issues which would result in little reward if fixed.”
- The study also claims that “Next, we ran CodeScene on Bokeh [a codebase], which lead to more significant results.”
- The case study concludes that “We found CodeScene to be more useful [..] as it provided us with a higher level view of problems and potential issues.”
- Using CodeScene also “shed a light on issues that were not apparent while previously examining the source code.”
Both of these studies also looked at people-factors. This is a topic where behavioral code analysis really shines, so let’s look at how to measure organizational factors.
Measure Team Work and Organizational Factors
The code itself is only one component of a software system. As soon as an organization grows beyond a handful of people, social aspects like coordination, communication, and motivation issues increase in importance. Unfortunately these, well, softer aspects of software development are invisible in our code; if you pick up a piece of code from your system there’s no way of telling if it’s been written by a single developer or if that code is a coordination bottleneck for five development teams. That is, we miss an important piece of information: the people side of code.
A behavioral code analysis tool like CodeScene helps you fill in the blanks. Since behavioral code analysis builds on social data – CodeScene knows exactly which programmer wrote each piece of code – it’s possible to build up knowledge maps of a codebase and to measure the coordination needs by detecting inter-team dependencies. This is useful to detect sub-systems that become coordination bottlenecks or lack clear ownership, as shown in the next figure.
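As a rough illustration of a knowledge map, the sketch below derives a main author and a list of contributing teams per file from authorship records. It is a simplified assumption about how such a map could be computed, not a description of CodeScene’s implementation. The record format `(author, file, lines_added)`, the `team_of` mapping, and all names in the sample data are hypothetical.

```python
from collections import defaultdict


def knowledge_map(contributions):
    """Return {file: (main_author, ownership_share)}.

    `contributions` is a list of (author, file, lines_added) records.
    The main author is the one who added the most lines to the file.
    """
    per_file = defaultdict(lambda: defaultdict(int))
    for author, path, lines_added in contributions:
        per_file[path][author] += lines_added

    result = {}
    for path, by_author in per_file.items():
        total = sum(by_author.values())
        # Sorting first makes ties resolve alphabetically.
        main_author = max(sorted(by_author), key=lambda a: by_author[a])
        share = by_author[main_author] / total if total else 0.0
        result[path] = (main_author, share)
    return result


def teams_per_file(contributions, team_of):
    """Return {file: sorted list of teams that contributed to it}."""
    teams = defaultdict(set)
    for author, path, _ in contributions:
        teams[path].add(team_of.get(author, "unknown"))
    return {path: sorted(names) for path, names in teams.items()}


# Hypothetical sample data.
contributions = [
    ("ann", "billing/invoice.py", 400),
    ("bob", "billing/invoice.py", 100),
    ("cy", "billing/invoice.py", 100),
    ("ann", "core/auth.py", 50),
]
team_of = {"ann": "payments", "bob": "platform", "cy": "mobile"}

owners = knowledge_map(contributions)
teams = teams_per_file(contributions, team_of)

for path in sorted(owners):
    author, share = owners[path]
    print(f"{path}: main author {author} ({share:.0%}), teams: {teams[path]}")
```

A file touched by many teams, like `billing/invoice.py` in this sample, is a candidate coordination bottleneck, while a file with a low ownership share for its main author may lack clear ownership.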
CodeScene takes the social concept further to support project planning with on- and off-boarding. For example, CodeScene’s simulation module lets you explore the effects of a planned off-boarding while the developers are still aboard. This gives you the opportunity to identify off-boarding risks and areas of the code in need of a new main developer. Note that none of that information is available in the code itself.
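The off-boarding idea can be sketched in a similar way. The example below is a hedged illustration under simple assumptions, not CodeScene’s simulation module: it flags files where the departing developers account for at least a given share of the contributed lines, and it suggests the strongest remaining contributor as a possible new main developer. The function name, the `threshold` parameter, and the sample data are hypothetical.

```python
def offboarding_risks(contributions_by_file, leaving, threshold=0.5):
    """List files that would lose most of their knowledge.

    `contributions_by_file` maps file -> {author: lines_added}.
    `leaving` is the set of developers planned to be off-boarded.
    Returns (file, share_lost, suggested_successor) tuples, riskiest first.
    """
    risks = []
    for path, by_author in contributions_by_file.items():
        total = sum(by_author.values())
        if total == 0:
            continue  # no recorded contributions, nothing to assess
        lost = sum(n for author, n in by_author.items() if author in leaving)
        share_lost = lost / total
        if share_lost < threshold:
            continue
        remaining = {a: n for a, n in by_author.items() if a not in leaving}
        # Strongest remaining contributor, ties resolved alphabetically;
        # None means nobody else has touched the file.
        successor = (
            max(sorted(remaining), key=lambda a: remaining[a])
            if remaining
            else None
        )
        risks.append((path, share_lost, successor))
    risks.sort(key=lambda r: (-r[1], r[0]))
    return risks


# Hypothetical sample data.
contributions_by_file = {
    "billing/invoice.py": {"ann": 400, "bob": 100, "cy": 100},
    "core/auth.py": {"ann": 50},
    "api/routes.py": {"bob": 300, "ann": 30},
}

risks = offboarding_risks(contributions_by_file, leaving={"ann"})
for path, share_lost, successor in risks:
    print(f"{path}: {share_lost:.0%} of knowledge leaves, successor: {successor}")
```

With this sample data, `core/auth.py` comes out as the highest risk because no other developer has contributed to it, which is the kind of area that needs a new main developer before the off-boarding happens.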
Explore More and Try CodeScene
Check out our white paper to learn more about CodeScene, its use cases, and how they fit into your existing workflow and roles.