Sketching complexity with Groovy
Not long ago, I wrote up a nifty dashboarding application, à la Groovy, in an effort to abate the visual pain associated with report overload syndrome. These kinds of applications are perfect for languages like Groovy, as you can knock them out in a matter of hours (including a test suite to verify kosherness).
One of the applications that adds to the number of reports one must digest is JavaNCSS. This disco tool analyzes a code base and reports everything relating to code length, including class sizes, method sizes, and the number of methods found in a class. What’s more, JavaNCSS reports method complexities, Cyclomatic style.

While all this data is helpful in some scenarios, it is probably too much to digest at a glance, man. For example, the report generated via Ant is shown to the right. That’s a lot of data, don’t you think?
Interestingly, the data from JavaNCSS can answer one question that came to mind: what is the distribution of complexity across the entire code base? Because it’s my bag, this can actually be helpful in understanding, at a high level, what’s going on. A “healthy” code base would not have a high proportion of highly complex methods; conversely, a highly complex code base would have a high number of complex methods.
Yet, the default report generated via Ant doesn’t really tell this story– while the data is there, it doesn’t stand out. What’s needed is a visual guide that quickly demonstrates the distribution of complexity– in this case, something like a bar chart would work just fine.
Before a chart can be generated, however, the data needs to be mined. This is where Groovy comes in. Via its copasetic built-in XML parsing, I can quickly get all the data I need to understand the distribution of complexity. For example, I was interested in five different ranges of complexity:
- Methods with a CCN of 1
- Methods with a CCN between 2 and 5
- Methods with a CCN between 6 and 10
- Methods with a CCN over 10 but less than or equal to 20
- Everything greater than 20
These ranges could, of course, vary depending on what you’re trying to understand. In my case, I wanted to get a feel for the number of really simple methods (like getters and setters) and constructors (which JavaNCSS treats just like normal methods by reporting CCN). A healthy code base, in the Age of Aquarius, would probably have the majority of its methods falling within the first two buckets and hopefully not have any data in the last two. The middle one may have a few methods here and there.
Obtaining the method complexity distribution from a JavaNCSS XML report via Groovy is absurdly simple, as this code snippet shows:
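```groovy
// doc holds the parsed JavaNCSS XML report; ones, low, medium, midMax,
// and max are collections (for example, lists) created earlier in the script.
def range1 = 2..5
def range2 = 6..10
def range3 = 11..20

this.doc.functions.function.each { mthd ->
    int ccn = Integer.parseInt(mthd.ccn.text())
    if (ccn == 1) {
        ones << mthd.name.text()
    } else if (range1.contains(ccn)) {
        low << mthd.name.text()
    } else if (range2.contains(ccn)) {
        medium << mthd.name.text()
    } else if (range3.contains(ccn)) {
        midMax << mthd.name.text()
    } else {
        // Anything above 20 lands in the last bucket
        max << mthd.name.text()
    }
}
```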
Using Groovy’s range feature, I can define some simple ranges that correspond with my desired distribution. Plus, in a closure that iterates over each function element in the JavaNCSS XML document, I can read the element’s ccn value and place the method’s name in the proper collection via the handy << syntax.
I can then feed this hip data to a charting utility, like JFreeChart, and generate a bar chart like so:
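To make the parsing step concrete, here is a minimal, self-contained sketch of the same idea. It is not the original script: the XML sample is a made-up, trimmed-down stand-in for a JavaNCSS report, the `bucketFor` closure and the `buckets` map are hypothetical names, and it uses a `switch` over ranges plus a map of lists where the snippet above uses separate collections.

```groovy
// Groovy 3+ import; older versions find groovy.util.XmlSlurper without it.
import groovy.xml.XmlSlurper

// Hypothetical, trimmed-down report; a real JavaNCSS file has many more elements.
def xml = '''
<javancss>
  <functions>
    <function><name>Foo.getBar()</name><ncss>2</ncss><ccn>1</ccn></function>
    <function><name>Foo.parse(String)</name><ncss>14</ncss><ccn>4</ccn></function>
    <function><name>Foo.route(Request)</name><ncss>40</ncss><ccn>12</ccn></function>
    <function><name>Foo.doEverything()</name><ncss>120</ncss><ccn>27</ccn></function>
  </functions>
</javancss>
'''

def doc = new XmlSlurper().parseText(xml)

// Map a CCN value to one of the five bucket names (names are assumptions).
def bucketFor = { int ccn ->
    switch (ccn) {
        case 1:      return 'ones'
        case 2..5:   return 'low'
        case 6..10:  return 'medium'
        case 11..20: return 'midMax'
        default:     return 'max'
    }
}

// Collect method names per bucket, using << to append to each list.
def buckets = [ones: [], low: [], medium: [], midMax: [], max: []]
doc.functions.function.each { mthd ->
    buckets[bucketFor(mthd.ccn.text().toInteger())] << mthd.name.text()
}

buckets.each { name, methods -> println "${name}: ${methods.size()}" }
```

Keeping the buckets in one map means the sizes can later be turned into chart values with a single loop, at the cost of looking up each list by name.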
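```groovy
import org.jfree.chart.ChartFactory
import org.jfree.chart.plot.PlotOrientation

// dataset is a category dataset created earlier; val1..val5 are the
// per-bucket ratios and disdta holds the five named buckets.
// Each bucket gets its own series so it can be colored individually.
dataset.addValue(val1 * 100, disdta[0].name, disdta[0].name)
dataset.addValue(val2 * 100, disdta[1].name, disdta[1].name)
dataset.addValue(val3 * 100, disdta[2].name, disdta[2].name)
dataset.addValue(val4 * 100, disdta[3].name, disdta[3].name)
dataset.addValue(val5 * 100, disdta[4].name, disdta[4].name)

def chart = ChartFactory.createBarChart(
    "Complexity Distribution",   // chart title
    "CCN Range",                 // category (x) axis label
    "% Total Methods",           // value (y) axis label
    dataset,
    PlotOrientation.VERTICAL,
    false,                       // no legend
    false,                       // no tooltips
    false                        // no URLs
)
```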
In this snippet, the val variables are each multiplied by 100 to create a percentage value (e.g., 66%). In earlier code, the total number of methods is obtained (e.g., 22,334) and the size of each collection is divided by this number to create a ratio (e.g., 543/22,334). What’s more, after creating the chart instance, you can customize various aspects of the chart, like its colors. For example, to change the five bars’ colors, you can obtain a BarRenderer instance from the chart’s plot instance and set each series color as follows:
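The ratio arithmetic described above can be sketched in a few lines. The bucket sizes here are invented for illustration, and `counts`, `total`, and `percentages` are hypothetical names rather than the variables from the original script.

```groovy
// Hypothetical bucket sizes; in the real script these would be the
// sizes of the collections filled while walking the XML report.
def counts = [ones: 550, low: 300, medium: 100, midMax: 40, max: 10]
def total = counts.values().sum()

// Ratio of each bucket to the total, scaled to a percentage.
// In Groovy, dividing two integers with / yields a BigDecimal,
// so the ratio is not truncated to zero as it would be in Java.
def percentages = counts.collectEntries { name, count ->
    [(name): (count / total) * 100]
}

percentages.each { name, pct ->
    println "${name.padRight(8)} ${String.format('%.1f', pct)}%"
}
```

The detail worth noticing is the division: the same expression written with `int` operands in Java would need an explicit cast to avoid integer division.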
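```groovy
import java.awt.Color

// plot is the chart's plot instance (for example, chart.getCategoryPlot()).
def rendr = plot.getRenderer()
// One color per series, running from green (simple) to dark red (very complex).
rendr.setSeriesPaint(0, new Color(102, 205, 0))
rendr.setSeriesPaint(1, new Color(0, 100, 0))
rendr.setSeriesPaint(2, new Color(255, 215, 0))
rendr.setSeriesPaint(3, new Color(255, 140, 0))
rendr.setSeriesPaint(4, new Color(139, 0, 0))
```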
Lastly, you can save the chart to a file like so:
```groovy
import org.jfree.chart.ChartUtilities

// Write the chart as a 375x200 pixel PNG
ChartUtilities.saveChartAsPNG(
    new File("C:\\dev\\projects\\acme\\target\\mc.png"),
    chart, 375, 200)
```

The resulting bar chart is shown to the right and displays the distribution of complexity across all methods within a code base. In this case, this code base has roughly 55% of its methods with a CCN of 1, man. One could infer that there are a lot of smokin’ JavaBean-style classes, which in this case is true. A small portion of methods, unfortunately, do have high complexity values, which does cause some concern.
Of course, this is only a partial picture, right? This bar chart doesn’t tell me anything about the associated coverage of those complex methods, and it’s only a snapshot in time, man: if this utility is run again tomorrow and the far right bar has grown, you’d know that things are getting worse.
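If you wanted to track that trend rather than eyeball it, one possible approach is to keep the previous run's percentages and compare. This is only a sketch under assumptions: both distributions are made-up numbers, and the names `yesterday`, `today`, `delta`, and `worsening` are hypothetical.

```groovy
// Hypothetical percentage distributions from two runs of the report.
def yesterday = [ones: 55.0, low: 30.0, medium: 10.0, midMax: 4.0, max: 1.0]
def today     = [ones: 54.0, low: 29.5, medium: 10.5, midMax: 4.5, max: 1.5]

// Change per bucket, in percentage points.
def delta = today.collectEntries { name, pct -> [(name): pct - yesterday[name]] }

// Growth in the two most complex buckets is the signal worth flagging.
def worsening = ['midMax', 'max'].findAll { delta[it] > 0 }
if (worsening) {
    println "Complexity is creeping up in: ${worsening.join(', ')}"
}
```

Only the two highest buckets are checked because a shift between the simple buckets says little about the health of the code base.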
As you can see, Groovy is an excellent choice for generating simple reports as you can knock them out in a snap. Plus, by building intelligent charts, you can further help save people from report overload syndrome. Dig it?
Friday 30 Mar 2007 | Code Metrics, Dynamic Languages, Groovy
[...] Original post by Andy and a wordpress plugin by Elliott [...]
You might be interested in the open source Hackystat Project, which explores how to more easily collect and integrate software measures. In Hackystat, software “sensors” are attached to various software development tools and send their data to a centralized server. Developers can run analyses that illuminate relationships between both process and product-oriented metrics. This makes it easy to investigate, for example, if variations in the level of code complexity are correlated with daily build failure.
There is a link on the home page to the Version 8 planning document, which might be of interest as this major release should make it much easier for developers such as yourself to “knock out” new kinds of reports using languages like Groovy. If that’s your bag, of course.
Philip, thanks for the pointer to Hackystat! This looks interesting indeed. The statement “if variations in the level of code complexity are correlated with daily build failure” is quite interesting, as in a former life we built a similar engine that reported on which lines of code had changed whenever a previously passing test failed.
I look forward to checking out Hackystat. Keep up the great work!