Code Metrics

Unambiguously analyzing metrics

Software metrics are objective measurements of particular aspects of code– for instance, cyclomatic complexity measures complexity without any regard for why code contains a certain number of paths. For metrics to be useful, baby, they must be applied subjectively. In the case of complexity, there may be circumstances that warrant such code (although I’ve yet to find complex code that can’t still benefit from refactoring). I’ve also found that, on the whole, metrics are more copasetic when combined with other metrics and trended– for instance, complexity alone is somewhat interesting, but pairing complexity with code coverage paints a much more detailed picture. High complexity with low coverage is clearly more risky than the same complexity with high code coverage– even the CRAP metric builds on this relationship.

One particular hip metric that I find helpful is the ratio of copy and pasted code within a code base as unknown copy and pasted code will haunt you, man. For instance, copy and pasted code replicates bugs and poorly coded algorithms to name a few nefarious aspects; consequently, understanding what code has been replicated can help teams refactor offending code. Having run various copy and paste analyzers on more code bases than I care to admit, (and because it’s my bag) I’ve found that all code bases have a certain level of offending code that triggers a copy and paste detection. One particular tool, CPD, is nice enough to create a report containing the offending code like so:

<duplication lines="7" tokens="53">
 <file line="36" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <file line="42" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <file line="48" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <codefragment>
 <![CDATA[
    } else if (first.equals("Jack")) {
      if(!second.equals("King") && !second.equals("Queen") && !second.equals("Jack")){
         return Integer.parseInt(second) + 10;
     } else {
      return 20;
    }
   }else if(first.equals("9")){
]]>
</codefragment>
</duplication>

As you can see, CPD reports the total number of lines of copy and pasted code and where that bogue code can be found. This data is certainly helpful; however, it doesn’t paint the entire story– while 7 lines doesn’t seem like all that much code, you’d probably reconsider if it were 7 lines of code in a 30 line code base or more realistically– 700 lines in a few thousand line code base. Therein lies the catch– CPD’s data is really only helpful when viewed on the whole (or a ratio– that is, total lines of copy and pasted code over total lines of code). Unfortunately, CPD doesn’t report the total lines of code scanned– only the total lines of copy and pasted code. For instance, in this sample code base, there were 9 suspected copy and pasted code fragments totaling about 120 lines of code (or CPLOC).

Luckily, there’s another handy tool which reports the total lines of code (or LOC) in a code base– JavaNCSS. Running JavaNCSS yielded a value of about 610 LOC; therefore the ratio of copy and pasted code is CPLOC/LOC or 120/610, which is roughly 20%.
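To make the arithmetic concrete, here is a minimal Java sketch of the CPLOC/LOC ratio. The class and method names are made up for illustration, and the inputs are simply the approximate numbers quoted above (about 120 CPLOC and about 610 LOC).

```java
// Illustrative helper (hypothetical names): turns two line counts into a percentage.
final class CplocRatio {

    /** Percentage of copy and pasted lines, rounded to the nearest whole percent. */
    static long percent(int cploc, int loc) {
        if (loc <= 0) {
            throw new IllegalArgumentException("loc must be positive");
        }
        return Math.round(100.0 * cploc / loc);
    }

    public static void main(String[] args) {
        // The sample code base from the text: about 120 CPLOC in about 610 LOC.
        System.out.println(percent(120, 610) + "%");
    }
}
```

The same helper also shows why the raw count alone is misleading: 7 duplicated lines in a 30 line code base is a far bigger share than 7 lines in a few thousand.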

20% CPLOC is probably a bad thing– at a minimum it is worth knowing about. 20% today might not be too important to know, but knowing that it increased to 25% next week would be an indication that things are degrading– likewise, seeing a value decrease over time indicates the code base is actively being improved. Yet, how can teams possibly monitor this trippin’ data?

Reports are hip, but in truth, reports by tools like CPD are essentially read once– the first time they are generated. After that, it’s anyone’s guess when someone will actively read the report again. Hence, I find it particularly helpful to essentially throw the report out and let the build itself proactively tell me when a particular metric gets out of hand. This essentially means that my build has to monitor a particular metric– and in the case of the CPLOC ratio, my build has to gather data from two sources– JavaNCSS’s report and CPD’s.

Fortunately, this is easy with Groovy– if, for instance, you are using Ant for builds, you can first generate the two reports as follows:

<target name="cpd">
 <mkdir dir="target/reports"/>
 <taskdef name="cpd" classname="net.sourceforge.pmd.cpd.CPDTask"
   classpathref="classpath"/>
 <cpd minimumTokenCount="10" outputFile="target/reports/cpd.xml" format="xml">
  <fileset dir="src">
   <include name="**/*.java"/>
  </fileset>
 </cpd>
</target>

The code above generates a CPD XML report from all the code in a src directory and the following code creates a JavaNCSS report from the same code base:

<target name="javancss">
 <taskdef name="javancss" classname="javancss.JavancssAntTask"
   classpathref="classpath" />
 <javancss srcdir="src" generateReport="true"
   abortOnFail="true" ccnPerFuncMax="100"
   outputfile="target/reports/javancss_metrics.xml" format="xml" />
</target>

The only high-level step left to do is to put the two metrics together; however, this step actually takes a few sub-steps, baby. For instance, obtaining the total lines of CPLOC requires iterating over a collection of duplication elements in the CPD xml file. Consequently, the following steps detail the effort required to obtain this metric:

  • parse the JavaNCSS xml report and obtain the total LOC
  • parse the CPD xml report and obtain the total CPLOC
  • divide the two and compare the result to some threshold
  • if the threshold is exceeded, fail the build

Groovy, by the way, is particularly well suited for such a task (as if you didn’t know that, man?)– parsing XML with Groovy is practically effortless– like disco dancing, eh? For instance, obtaining the total LOC from JavaNCSS’s xml file is as easy as

int ncss = Integer.parseInt(jncssroot.packages.total.ncss.text())

Note, I’m coercing the values to integers as I’d like to divide (and round) my result– if I don’t explicitly specify ints I’ll be left with String division, which doesn’t work so well.

Parsing CPD’s xml document is slightly more complex– slightly in that it takes 3 times as much code:

def cpdtot = 0
cpdroot.duplication.each { elem ->
 cpdtot += Integer.parseInt(elem.@lines.text())
}

Again, parsing an XML document yields String values; accordingly, I need to use Integer’s parseInt method.

Next, all I need to do is divide the two and, in my case, I’m aggressively rounding up via Java’s ceiling call as follows:

def ratio = Math.ceil((cpdtot / ncss) * 100)

Multiplying the result by 100 gives me a percentage value, of course, and lastly, I compare that to a threshold value:

if(ratio > Double.parseDouble(properties.cpd_threshold)){
  ant.fail(message:
   "cut and paste ratio was greater than ${properties.cpd_threshold}%, it was ${ratio}%")
}

Puttin’ it all together, baby, yields a groovy Ant script with a hip target:

<target name="cpd-threshold" depends="metrics">
 <groovy>
 def jncssroot = new XmlSlurper().parse("target/reports/javancss_metrics.xml")
 int ncss = Integer.parseInt(jncssroot.packages.total.ncss.text())

 def cpdroot = new XmlSlurper().parse("target/reports/cpd.xml")

 def cpdtot = 0
 cpdroot.duplication.each { elem ->
  cpdtot += Integer.parseInt(elem.@lines.text())
 }

 def ratio = Math.ceil((cpdtot / ncss) * 100)

 if(ratio > Double.parseDouble(properties.cpd_threshold)){
  ant.fail(message:
   "cut and paste ratio was greater than ${properties.cpd_threshold}%, it was ${ratio}%")
 }
 </groovy>
</target>
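For teams that can't put Groovy on the build classpath, the same four steps can be sketched in plain Java with the JDK's built-in XPath support. This is only a sketch under assumptions: the class and method names are hypothetical, the reports are passed in as strings to keep it self-contained, and the XPath expressions assume the same element layout the Groovy code navigates (duplication elements with a lines attribute, and a packages/total/ncss element).

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical plain-Java counterpart to the Groovy threshold check.
final class CpdThresholdCheck {

    /** Evaluates an XPath expression against an XML string and returns it as a number. */
    static double number(String xml, String expression) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)),
                            XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    /** CPLOC/LOC as a percentage, rounded up like the Groovy version. */
    static double ratio(String cpdXml, String javancssXml) {
        // Sum the lines attribute of every duplication element.
        double cploc = number(cpdXml, "sum(//duplication/@lines)");
        // Assumed location of the total NCSS value in the JavaNCSS report.
        double loc = number(javancssXml, "//packages/total/ncss");
        if (!(loc > 0)) {
            throw new IllegalStateException("No total NCSS value found");
        }
        return Math.ceil(cploc / loc * 100);
    }

    /** True when the build should fail. */
    static boolean exceeds(String cpdXml, String javancssXml, double thresholdPercent) {
        return ratio(cpdXml, javancssXml) > thresholdPercent;
    }
}
```

It takes a good deal more ceremony than the Groovy script, which is rather the point.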

Reports are hip, but they are usually only read once– the first time they are generated. Rather than waiting to find out that there’s a problem, proactively analyzing a hip metric (such as CPLOC/LOC) enables rapid feedback and rapid corrections– is that unambiguous or what?

Chewing the fat over cyclomatic complexity

The hip folks at Enerjy talked with a copasetic crowd recently asking their thoughts on cyclomatic complexity. It seems most (including this disco dancer) find that CCN is an excellent indicator of risk– I haven’t found a better metric yet– in fact, a recent addition to the hip metrics crowd, dubbed C.R.A.P, which was donated by the folks at Agitar, even builds upon CCN by combining code coverage. What do you think, man?

This podcast doesn’t stink, man

I recently had a copasetic conversation with Alberto Savoia regarding the hip CRAP metric– my parents would be appalled with our language (I think the word in question is used at least 135 times); however, we had a good time discussing the efficacy of the metric, its future, and of course, its malodorously applied name.

In short, the CRAP metric effectively marries code coverage with cyclomatic complexity in an effort to delineate risk associated with change. Have a listen (courtesy of JavaWorld) and stay tuned for more podcasts, baby!
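For the curious, here is a small Java sketch of how the score is usually computed. The formula isn't spelled out in this post, so treat it as an assumption: it is the commonly cited Crap4j definition, comp² × (1 − cov/100)³ + comp, where comp is a method's cyclomatic complexity and cov is its coverage percentage. The class and method names are mine.

```java
// Sketch of the commonly cited CRAP formula (assumed, not taken from this post):
// CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
final class CrapScore {

    /**
     * @param complexity      cyclomatic complexity of the method
     * @param coveragePercent coverage of the method, from 0 to 100
     */
    static double crap(int complexity, double coveragePercent) {
        double uncovered = 1.0 - coveragePercent / 100.0;
        return Math.pow(complexity, 2) * Math.pow(uncovered, 3) + complexity;
    }

    public static void main(String[] args) {
        // Same complexity, very different risk depending on coverage.
        System.out.println(crap(10, 0));
        System.out.println(crap(10, 100));
    }
}
```

With full coverage the score collapses to the complexity itself; with no coverage the squared term dominates, which is exactly the "high complexity, low coverage" risk the metric is meant to flag.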

Solving the code coverage dilemma with Emma

Because it’s my bag, I pointed out recently that copasetic coverage tools (like Cobertura) can inadvertently hide defects by reporting specific lines of code as covered. But, while I often use Cobertura in my examples, I have found that Emma is fairly smart in its reporting of code coverage values. As such, I often find myself running both hip tools for projects.

For example, the same branchIt method from my previous posting is displayed slightly differently in Emma as shown below. Specifically, note line 10– it’s colored yellow in an attempt to show that not all conditions of the conditional were executed.



While Emma probably reports coverage more accurately (note, it reports block coverage), I still find Cobertura’s reports more aesthetically pleasing. In truth, I’m a firm believer that coverage reports are more effective at telling you what’s not covered; accordingly, both tools are quite accurate in this regard. Can you dig it, man?

Short-circuiting code coverage

As I’ve written about before, code coverage numbers can be misleading. 100% line coverage and 100% branch coverage don’t necessarily mean your code is defect free– all it means is that the code was touched. In fact, code coverage values are far more effective at telling you what’s not covered by a test, man.

There are actually quite a few hip different ways that coverage values can be misleading. One particular coding construct can easily mislead the unsuspecting eye: the short-circuit operator. These operators, such as the short-circuit AND (&&) and OR (||) are quite handy in conditionals. For example, here’s a fictional code snippet with a copasetic OR short-circuit operator in the first conditional.

public void branchIt(int value){
 if((value > 100) || (HiddenObject.doWork() == 0)){
  this.doIt();
 }else{
  this.doItAgain();
 }
}  

In the snippet above, neither the doIt nor the doItAgain method is particularly important– what’s key here is the second part of the if conditional. Let’s imagine HiddenObject is a 3rd party API call, an object in another package, etc– the point being, you don’t necessarily know how doWork does its work; you just know, because it’s its bag, that a return value of 0 is a valid condition.

It just so turns out that the doWork method on the HiddenObject class isn’t perfect– it can throw an exception. I’ll force one, but the scenario demonstrates a trippin’ point.

public static int doWork(){
 throw new RuntimeException("surprise!");
}

Thus far we know that if the doWork method is executed an exception will be thrown. Imagine I don’t know that though. Let me write a quick test (via JUnit) to make sure things are working.

public final void testBranchIt() {
 AnotherBranchCoverage clzzUnderTst = new AnotherBranchCoverage();
 clzzUnderTst.branchIt(101);
}

When I run this test, things work fine. I can even check out the coverage values as reported by Cobertura.

Not bad– I’ve got reasonable line coverage here and 100% branch coverage. It turns out the 100% value for branch coverage is a slight defect within Cobertura, but regardless, I’ve got copasetic coverage, don’t I? Check it out, man, the if statement was touched! Because of the short-circuit OR, I triggered a true condition via the 101 value; accordingly, the second clause was short-circuited. The 75% line coverage makes sense too– I failed to execute the else block, hence the other method within the class wasn’t touched and the line coverage value was accordingly deducted.

As you can see, coverage reports are hip in ascertaining what’s not tested (which, in this case, would practically force me to execute the 2nd condition in the if) but don’t depend too heavily on them telling you what’s tested. Otherwise you could end up short-circuiting yourself into a false sense of security…or is that coverage?
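Here is a self-contained Java sketch of the same trap, for anyone who wants to watch it happen. It is a stand-in rather than the original classes: the class name and the call counter are my additions, and branchIt returns a label instead of calling doIt or doItAgain so the outcome is easy to check.

```java
// Stand-in for the branchIt example; the counter proves doWork() is never reached.
final class ShortCircuitDemo {

    static int doWorkCalls = 0;

    /** Mimics the hidden dependency that always blows up. */
    static int doWork() {
        doWorkCalls++;
        throw new RuntimeException("surprise!");
    }

    static String branchIt(int value) {
        // With value > 100 the right-hand side of || is skipped entirely.
        if ((value > 100) || (doWork() == 0)) {
            return "doIt";
        } else {
            return "doItAgain";
        }
    }

    public static void main(String[] args) {
        System.out.println(branchIt(101));   // passes quietly
        System.out.println(doWorkCalls);     // doWork() was never called
        branchIt(5);                         // only now does the exception surface
    }
}
```

A test that only ever passes 101 marks the if line as touched while the defect sits one operand to the right; a second test with a value of 100 or less is what flushes it out.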

Sketching complexity with Groovy

Not long ago, I wrote up a nifty dashboarding application, à la Groovy, in an effort to abate the visual pain associated with report overload syndrome. These kinds of applications are perfect for languages like Groovy as you can knock them out in a matter of hours (including a test suite to verify kosherness).

One of the applications which adds to an increase in the number of reports one must digest is JavaNCSS. This disco tool analyzes a code base and reports everything relating to code length, including class sizes, method sizes, and the number of methods found in a class. What’s more, JavaNCSS reports method complexities, Cyclomatic style.

While all this data is helpful in some scenarios, it is probably too much to digest at a glance, man. For example, the report generated via Ant is shown to the right. That’s a lot of data, don’t you think?

Interestingly, one scenario came to mind that the data from JavaNCSS could provide– what is the distribution of complexity across the entire code base? Because it’s my bag, this actually can be helpful in understanding, at a high level, what’s going on. A “healthy” code base would not have a high degree of highly complex methods; conversely, a highly complex code base would have a high number of complex methods.

Yet, the default report generated via Ant doesn’t really tell this story– while the data is there, it doesn’t stand out. What’s needed is a visual guide that quickly demonstrates the distribution of complexity– in this case, something like a bar chart would work just fine.

Before a chart can be generated, however, the data needs to be mined. This is where Groovy comes in. Via its copasetic built-in XML parsing, I can quickly get all the data I need to understand the distribution of complexity. For example, I was interested in five different ranges of complexity:

  • Methods with a CCN of 1
  • Methods with a CCN between 2 and 5
  • Methods with a CCN between 6 and 10
  • Methods with a CCN over 10 but less than or equal to 20
  • Everything greater than 20

These ranges could, of course, vary depending on what you’re trying to understand. In my case, I wanted to get a feel for the number of really simple methods (like getters and setters) and constructors (which JavaNCSS treats just like normal methods by reporting CCN). A healthy code base, in the Age of Aquarius, would probably have the majority of its methods falling within the first two buckets and hopefully not have any data in the last two. The middle one may have a few methods here and there.

Obtaining the method complexity distribution in a JavaNCSS XML report via Groovy is absurdly simple as this code snippet shows:

def range1 = 2..5
def range2 = 6..10
def range3 = 11..20

this.doc.functions.function.each{ mthd ->
 int ccn = Integer.parseInt(mthd.ccn.text())

 if(ccn == 1){
  ones << mthd.name.text()
 }else if (range1.contains(ccn)){
  low << mthd.name.text()
 }else if(range2.contains(ccn)){
  medium << mthd.name.text()
 }else if(range3.contains(ccn)){
  midMax << mthd.name.text()
 }else{
  max << mthd.name.text()
 }
}

Using Groovy’s range feature, I can define some simple ranges that correspond with my desired distribution. Plus, in a closure that iterates over each function element in the JavaNCSS XML document, I can obtain the element ccn’s value and place it within the proper collection via the handy << syntax.
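The bucketing itself doesn't depend on Groovy. As a rough Java sketch (hypothetical class, method, and label names; the CCN values are handed in as a plain array rather than parsed from the report), the five ranges and the share of methods in each could be computed like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sorts CCN values into the five ranges and reports percentages.
final class ComplexityBuckets {

    static final String[] LABELS = {"1", "2-5", "6-10", "11-20", ">20"};

    /** Returns the label of the range a single CCN value falls into. */
    static String bucketFor(int ccn) {
        if (ccn <= 1) return LABELS[0];
        if (ccn <= 5) return LABELS[1];
        if (ccn <= 10) return LABELS[2];
        if (ccn <= 20) return LABELS[3];
        return LABELS[4];
    }

    /** Percentage of methods per range, in range order. */
    static Map<String, Double> distribution(int[] ccns) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String label : LABELS) {
            counts.put(label, 0);
        }
        for (int ccn : ccns) {
            counts.merge(bucketFor(ccn), 1, Integer::sum);
        }
        Map<String, Double> percentages = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Avoid dividing by zero when no methods were found.
            double share = ccns.length == 0 ? 0.0 : 100.0 * entry.getValue() / ccns.length;
            percentages.put(entry.getKey(), share);
        }
        return percentages;
    }
}
```

A LinkedHashMap keeps the ranges in order, which matters once the values are handed to something that draws them left to right.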

With this hip data, I can then feed it to a charting utility, like JFreeChart, and generate a bar chart like so:

dataset.addValue(val1 * 100, disdta[0].name, disdta[0].name)
dataset.addValue(val2 * 100, disdta[1].name, disdta[1].name)
dataset.addValue(val3 * 100, disdta[2].name, disdta[2].name)
dataset.addValue(val4 * 100, disdta[3].name, disdta[3].name)
dataset.addValue(val5 * 100, disdta[4].name, disdta[4].name)

def chart = ChartFactory.createBarChart(
 "Complexity Distribution",
 "CCN Range",
 "% Total Methods",
 dataset,
 PlotOrientation.VERTICAL,
 false,
 false,
 false
)

In this snippet the val variables above are each multiplied by 100 to create a percentage value (e.g. 66%). In earlier code, the total number of methods is obtained (e.g. 22,334) and each obtained collection value is divided by this number to create a ratio (e.g. 543/22,334). What’s more, after creating the chart instance, you can customize various aspects of the chart like its colors, etc. For example, to change the five bars’ colors, you can obtain a BarRenderer instance from the chart’s plot instance and set the series color as follows:

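// a bar chart is backed by a CategoryPlot, whose renderer paints each series
def plot = chart.getCategoryPlot()
def rendr = plot.getRenderer()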

rendr.setSeriesPaint(0, new Color(102,205,000))
rendr.setSeriesPaint(1, new Color(000,100,000))
rendr.setSeriesPaint(2, new Color(255,215,000))
rendr.setSeriesPaint(3, new Color(255,140,000))
rendr.setSeriesPaint(4, new Color(139,000,000))

Lastly, you can save the chart instance to a file too like so:

ChartUtilities.saveChartAsPNG(
  new File("C:\\dev\\projects\\acme\\target\\mc.png"),
    chart, 375, 200)                            

The resulting bar chart is shown to the right and displays the distribution of complexity across all methods within a code base. In this case, this code base has roughly 55% of its methods with a CCN of 1, man. One could infer that there are a lot of smokin’ JavaBean style classes, which in this case is true. A small portion of methods, unfortunately, do have some high complexity values, which does cause some concern.

Of course, this is only a partial picture, right? This bar chart doesn’t tell me anything about the associated coverage of those complex methods and it’s only a snapshot in time, man– tomorrow, if this utility is run and the far right bar grew, you’d know that things are getting worse.

As you can see, Groovy is an excellent choice for generating simple reports as you can knock them out in a snap. Plus, by building intelligent charts, you can further help save people from report overload syndrome. Dig it?

Curtail complexity with a rules engine

Complexity can manifest itself within a software application in a number of hip ways, including dependency management (i.e. 3rd party libraries required for runtime, etc), architectural adherence patterns (think old style EJBs), and even coding constructs (in particular, excessive use of conditionals). When it comes to coding constructs, the resulting complexity is often related to the problem being solved. For example, imagine a recommendation wizard for sales associates selling hip disco LPs. Quite simple, right? You have two groovy choices– anything from Donna Summer or the Bee Gees. If only life were this easy, eh?

Now imagine a recommendation wizard for smokin’ sales associates trying to move beer. Now that’s more real, isn’t it? Imagine the store is trying to move (i.e. sell to customers) seven different types of beers– all varying in taste and characteristics. The store wants to develop an application that walks someone through a series of questions and based upon their answers, will recommend one of seven beers. Think of this application as an expert beer system– and while it may start with only seven beers, over time more beers will be added, especially if the system proves itself to move beers efficiently. What’s more, you the developer aren’t a beer aficionado (i.e. a domain expert on beers)– your job is to make an application that beer experts can modify so customers can pick beers more easily.

Logically, you can build a hip beer expert system with a couple of conditionals– if you like this characteristic, then you should buy this beer, right? In pseudocode, your logic could look like this (after you’ve had a beer session with the beer experts):

Do you like a light beer or a dark beer?
 if light beer:
  Do you like crisp, smooth beers or prefer a more hoppy one?
   if crisp:
    then Pils
   else:
    Do you like light hops or more aggressive hops?
     if light:
      then Pale Ale
     else:
      IPA
 else:
  Do you like the taste of coffee?
   if yes:
    Chocolate Stout
   else:
    Do you like spiciness?
     if yes:
      try Winter Ale
     else:
      Do you like high alcohol content?
       if yes:
        try an Eisbock
       else:
        try a Lager

This particular block of code (which enables one to pick one of seven beers and is by no means an accurate expert system), if isolated in a method, would have a cyclomatic complexity of at least 13, which presents a challenge– methods over 10, with conditional nesting, are havens for defects, especially if this code changes often. What if next week, the Pilsner brand is sold out? You’ll need to modify the logic to select perhaps another type of beer. In fact, the logic may not be as easy as replacing the Pilsner with another neat-o beer– it may involve a new series of questions.
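As a rough idea of what that logic turns into once it lands in a method, here is a minimal Java sketch. It is deliberately simplified: every answer arrives as a boolean, and the class, method, and parameter names are invented for illustration. Code that asks the questions and parses free-form answers would carry more branches than this.

```java
// Simplified, hypothetical rendering of the beer decision tree as nested conditionals.
final class NestedBeerWizard {

    static String recommend(boolean likesLight, boolean likesCrisp, boolean likesLightHops,
                            boolean likesCoffee, boolean likesSpice, boolean likesHighAlcohol) {
        if (likesLight) {
            if (likesCrisp) {
                return "Pils";
            } else if (likesLightHops) {
                return "Pale Ale";
            } else {
                return "IPA";
            }
        } else {
            if (likesCoffee) {
                return "Chocolate Stout";
            } else if (likesSpice) {
                return "Winter Ale";
            } else if (likesHighAlcohol) {
                return "Eisbock";
            } else {
                return "Lager";
            }
        }
    }
}
```

Notice that reaching the IPA branch means getting three answers exactly right, and swapping out one beer means editing the middle of the nest.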

It turns out that in these scenarios, a rules engine may actually be beneficial– in fact, rules engines (or expert systems) are well suited to replace excessive if, else, switch logic, especially if that logic is the domain of non-technical experts (in the case above, the beer experts haven’t a clue about coding nor hygiene, for that matter).

Using a rules engine, however, requires you to flatten business logic somewhat; in fact, in the copasetic beer expert system above, it requires you to focus on particular goals (i.e. moving a particular beer brand) and work backwards from that. For example, if I want to move an IPA, the attributes are:

  • Likes a light beer as opposed to a dark one
  • Prefers a hoppy taste
    • And tends to like a more aggressively hopped one too

Keep in mind, that in a real expert system for making recommendations, the number of attributes would most likely be greater. Based on the attributes of beer elaborated in the pseudocode above, however, I can group them into three categories, which I’ll designate as Java 5 enums:

public enum Color {
  LIGHT, DARK
}
public enum Taste {
  CRISP, HOPPY, AGGRESSIVE_HOPS,
  LIGHT_HOPS, COFFEE, SPICY, MALTY
}
public enum ABV {
  HIGH_ALCOHOL, NORMAL_ALCOHOL,
  LIGHT_ALCOHOL, NO_ALCOHOL
}

These enumerations will live inside of a BeerPreference object, which holds a Color, a Collection of Tastes, and an ABV:

public class BeerPreference {
 private Color color;
 private Collection<Taste> tastes;
 private ABV abv;
 //...
}

The class will also hold a recommendedBeer property, which the rules engine will appropriately set based upon the other attributes’ values:

private String recommendedBeer;

public String getRecommendedBeer() {
  return recommendedBeer;
}
public void setRecommendedBeer(String recommendedBeer) {
 this.recommendedBeer = recommendedBeer;
} 

In my case, I’ll use Drools, which is an excellent open source expert system, to define my rules. For example, below is a tripped out rule for determining if the choices present mean a person should try out an IPA.

rule "Mendocino White Hawk IPA Rule"
 when
   $beer: BeerPreference(color == Color.LIGHT,
     tastes contains Taste.HOPPY,
     tastes contains Taste.AGGRESSIVE_HOPS,
     tastes excludes Taste.SPICY,
     tastes excludes Taste.COFFEE)
 then
   $beer.setRecommendedBeer("Mendocino White Hawk IPA");
end

Note that the copasetic rules syntax isn’t too hard to pick up– it’s quite logical: if the BeerPreference’s color property is light and the collection of Tastes includes Taste.HOPPY and Taste.AGGRESSIVE_HOPS and contains neither Taste.SPICY nor Taste.COFFEE, then the rule engine will take the BeerPreference instance (which is $beer) and set the recommended beer to "Mendocino White Hawk IPA" (which, by the way, is an excellent beer). Drools’ rules syntax is simple– object attributes are obtained via their proper name, rather than by a getter method, logical ands are denoted via commas, and binding variables is done via the : operator.
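If the rule syntax still looks foreign, it may help to read the when clause as a single flat boolean test. The Java below is only a mental model, not how Drools evaluates rules: the class and method are hypothetical, and the two enums are trimmed copies nested inside the class to keep the sketch self-contained.

```java
import java.util.Set;

// Mental model only: the IPA rule's conditions written as one flat predicate.
final class IpaRuleSketch {

    enum Color { LIGHT, DARK }

    enum Taste { CRISP, HOPPY, AGGRESSIVE_HOPS, LIGHT_HOPS, COFFEE, SPICY, MALTY }

    /** Each comma in the rule's when clause becomes a logical AND here. */
    static boolean matchesIpa(Color color, Set<Taste> tastes) {
        return color == Color.LIGHT
                && tastes.contains(Taste.HOPPY)
                && tastes.contains(Taste.AGGRESSIVE_HOPS)
                && !tastes.contains(Taste.SPICY)      // "excludes"
                && !tastes.contains(Taste.COFFEE);    // "excludes"
    }
}
```

The difference with a rules engine is that this condition lives in a text file the beer experts can edit, rather than in compiled code.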

Testing rules is most easily done via table based frameworks like Fit. Writing tests via JUnit or TestNG, while possible, can become laborious due to the number of combinations one must code. Nevertheless, I can code a simple sunny-day scenario test case via JUnit to demonstrate Drools in action.

First, I must initialize Drools, which involves loading my rules (found in the file beer-guide.drl) and adding them to a Drools RuleBase like so:

public class BeerPreferenceTest {
 private static RuleBase ruleBase;

 @BeforeClass
 public static void setUpBeforeClass() throws Exception {
  Reader source =
   new InputStreamReader(
     BeerPreference.class.getResourceAsStream("beer-guide.drl"));

  PackageBuilder builder = new PackageBuilder();
  builder.addPackageFromDrl(source);
  final Package pkg = builder.getPackage();
  BeerPreferenceTest.ruleBase = RuleBaseFactory.newRuleBase();
  BeerPreferenceTest.ruleBase.addPackage(pkg);
 }
}

Now that Drools is ready to go and because it’s my bag, I can create an instance of WorkingMemory and pass in my BeerPreference instance. Remember, you must call the fireAllRules method on your WorkingMemory instance to force things to happen.

@Test
public void verifyIPA() throws Exception{
 WorkingMemory workingMemory =
 BeerPreferenceTest.ruleBase.newWorkingMemory();

 BeerPreference beer = new BeerPreference();
 beer.setColor(Color.LIGHT);
 beer.addTastePreference(Taste.HOPPY);
 beer.addTastePreference(Taste.AGGRESSIVE_HOPS); 

 workingMemory.assertObject(beer);
 workingMemory.fireAllRules();

 assertEquals("Should be Mendocino White Hawk IPA",
   "Mendocino White Hawk IPA",
    beer.getRecommendedBeer());
}

Using a hip rules engine doesn’t necessarily reduce complexity– it just isolates portions of it into a format that can be manipulated by non programmers. In essence, a rules engine creates flexibility, while also providing for more testability. Note how in the test above, I was able to isolate my logic for IPAs without having to deal with any of the other six beers. With normal conditionals, I might have had to concern myself with the other choices, so as to force the IPA one. Luckily, my logic is quite simple so this testing challenge may not be entirely apparent.

If you find excessive logic that:

  • Changes often
  • Is the purview of domain experts who don’t write the code

then you may want to look into an expert system, which can centralize human-readable logic into one location. Rules engines aren’t a silver bullet, nor are they perfect for all scenarios; however, if applied correctly, they can decrease conditional complexity quite nicely.

Currying maximum favor with Groovy

As a side project to building a small build results dashboard with Groovy, I found myself writing a copasetic application, which analyzed a code base using JDepend’s programmatic API. In both instances, I found myself needing to determine the maximum numeric value in a collection. Now, in Groovy (and in any language, for that matter), there is a cornucopia of ways to obtain a maximum value– either through brute force logic or more creatively hip mechanisms.

For example, in my dashboard application, I needed to determine the maximum cyclomatic complexity in a code base as reported by JavaNCSS. The output of JavaNCSS is an XML report with varying pieces; however, the section I found myself concentrating on has the following pattern:

<functions>
 <function>
  <name>com.acme.GRBuilder.GRBuilder()</name>
  <ncss>2</ncss>
  <ccn>1</ccn>
  <javadocs>1</javadocs>
 </function>
 <function>
  <name>com.acme.GRBuilder.addConf(FileSet)</name>
  <ncss>2</ncss>
  <ccn>1</ccn>
  <javadocs>1</javadocs>
 </function>
.....
</functions>

XML parsing is essentially built into the Groovy language and acts like a far out object graph; hence, traversing XML is just like navigating through an object– for instance, once I have the root node of the document I can, for example, grab the list of function elements in the XML above with doc.functions.function. In this case, doc is pointing to the root node.

Obtaining the maximum ccn element value from the XML above can then be achieved through brute force as follows:

private int getMaxComplexity(){
 def doc = this.getParsedJavaNCSS()
 if(doc != null){
  int maxccn = 0
  doc.functions.function.each{ mthd ->
   int ccn = Integer.parseInt(mthd.ccn.text())
   if (ccn > maxccn){
      maxccn = ccn
   }
  }
  return maxccn
 }else{
  return 0
 }
}

In the code above, the method getParsedJavaNCSS returns the root node of a JavaNCSS XML report. Next, a neat-o counter value is set to zero and for each function element, the ccn child element’s value is obtained (via mthd.ccn.text()). This value is compared to the counter and obviously, if the counter is less than the obtained value, the counter is updated. Dig it?

The code above works, but there is another way to obtain a maximum value from a collection (in fact, there are a few more ways) that is a bit more interesting. By using curried closures, you can create a reusable mechanism for determining the maximum value from a collection. This is a key benefit– now you can put logic specific to determining a maximum value into a reusable object that other methods can now reference, rather than having a bunch of methods executing logic like I have listed above.

A curried closure is a closure that has multiple parameters and which has had some of those parameters already seeded with a smokin’ value. This seeding of values creates, in essence, a new closure albeit with some default values. Currying doesn’t make a lot of sense until you see it in action; accordingly, the code below defines a simple closure that compares two values and returns whichever one has the greater value.

maxValueClosure = { x, y ->
 if(x >= y){
  return x
 }else{
  return y
 }
}

Currying this closure with one value effectively sets the x parameter with that value; hence, once you curry this closure, you’ll need to reference the newly created closure (which only takes one value– and that value will be y). For example, because it’s my bag, I can curry the maxValueClosure with 0 and then pass in another value to see what value is returned. If that value is greater than 0, I’d expect that value to be returned.

def maxValueClosure

void setUp(){
 this.maxValueClosure = { x, y ->
  if(x >= y){
   return x
  }else{
   return y
  }
 }
}

void testSimpleCurry() {
 def seed = this.maxValueClosure.curry(0)
 def value = seed(9)
 assertEquals(9, value)
}

In the code above, the line this.maxValueClosure.curry(0) effectively creates a new closure that looks like this:

seed = { 0, y ->
  if(0 >= y){
   return 0
  }else{
   return y
  }
 }

Obviously, the first test is so establishment; however, a more complex test will make currying more clear.

void testCurryingWithCollection() {                     

 def seed = this.maxValueClosure.curry(0)

 def values = [6,7,22,1,7,88,3,0,44]
 def max
 values.each{
  max = seed(it)
  seed = this.maxValueClosure.curry(max)
 }
 assertEquals(88, max)
}

As you can see, the curried closure correctly kept 88 as the max value– the it reference in the values iteration is a short cut in Groovy for the current value of the collection. For each iteration of the values collection, the curried closure’s y value was set to the current element and compared with the previously curried value. During the first iteration, the x value was 0 and the y value became 6. The comparison was made, which returned 6. The maxValueClosure was then curried again, but this time with the value 6. So upon the second iteration, 6 and 7 were compared and the process repeated, but with 7 now becoming the x value.
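The re-currying loop may be easier to follow in a language without a built-in curry method. Below is a rough Java translation, offered as a sketch: Java has no curry on its functional interfaces, so the curry helper here is hand-rolled, and all the names are my own.

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.UnaryOperator;

// Hand-rolled currying in Java, mirroring the Groovy walk-through.
final class CurriedMax {

    /** Two-argument function that returns the larger value. */
    static final BinaryOperator<Integer> MAX_VALUE = (x, y) -> x >= y ? x : y;

    /** Fixes the first argument of fn, leaving a one-argument function of y. */
    static UnaryOperator<Integer> curry(BinaryOperator<Integer> fn, Integer x) {
        return y -> fn.apply(x, y);
    }

    /** Walks the list, re-currying with the best value seen so far. */
    static int maxOf(List<Integer> values) {
        UnaryOperator<Integer> seed = curry(MAX_VALUE, 0);
        int max = 0; // unlike the Groovy test, an empty list yields the seed value 0
        for (Integer value : values) {
            max = seed.apply(value);      // compare the remembered x with the current y
            seed = curry(MAX_VALUE, max); // remember the winner for the next round
        }
        return max;
    }
}
```

As in the Groovy version, seeding with 0 quietly assumes the values are never negative.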

Of course, this collection could have first been sorted and then obtaining the maximum value would be a matter of popping off the last value; however, in real life, I am not dealing with simple Integer collections– I’m dealing with collections of objects that have methods attached which return numeric values.

Determining the maximum value for a particular JDepend metric then becomes an exercise in currying the maximum value obtaining closure during iteration of a desired metric. If you are not familiar with JDepend’s API, with it you can obtain a collection of packages and then call various hip methods on a particular package, like efferentCoupling(), afferentCoupling(), and instability() to name a few.

Accordingly, I’ve defined two methods: one which returns the maxValueClosure and another which determines the maximum value of any metric across the boss packages JDepend has analyzed.

def maxValueClosure(){
 def maxValue = { x, y ->
  if(x >= y){
   return x
  }else{
   return y
  }
 }
 return maxValue
}       

def maxMetricCalculation(methodName){
 def maxValue = this.maxValueClosure()
 def seed = maxValue.curry(0)
 def max
 this.analyzer.getPackages().each{
  if(it.getClassCount() > 0){
   max = seed(it.invokeMethod(methodName, null))
   seed = maxValue.curry(max)
  }
 }
 return max
}

The maxMetricCalculation method does a number of things similar to some of the test cases listed above with the added convenience of generically calling any method hanging off of JDepend’s JavaPackage object. Dynamically calling a method on an object in Groovy is done via invokeMethod, which takes two parameters– the method name and an Object array of that method’s parameter values. Because the methods I’m invoking take no parameters, I pass in null.
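To get a feel for what invokeMethod is saving, here is a rough Java sketch of the same look-up-by-name trick using plain reflection. It is not JDepend code: StubPackage is a hypothetical stand-in for JavaPackage with made-up values, and maxMetric is an illustrative name.

```java
import java.lang.reflect.Method;
import java.util.List;

class ReflectiveMetricSketch {
    // Hypothetical stand-in for JDepend's JavaPackage; the values are made up.
    public static class StubPackage {
        private final float instability;
        private final int efferent;

        public StubPackage(float instability, int efferent) {
            this.instability = instability;
            this.efferent = efferent;
        }

        public float instability() { return instability; }
        public int efferentCoupling() { return efferent; }
    }

    // Looks up a public no-argument method by name on each object
    // and keeps the largest numeric result.
    static double maxMetric(List<?> packages, String methodName) {
        double max = 0;
        for (Object pkg : packages) {
            try {
                Method metric = pkg.getClass().getMethod(methodName);
                Number value = (Number) metric.invoke(pkg);
                max = Math.max(max, value.doubleValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot call " + methodName, e);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<StubPackage> packages =
                List.of(new StubPackage(0.25f, 3), new StubPackage(0.75f, 9));
        System.out.println(maxMetric(packages, "instability"));      // prints 0.75
        System.out.println(maxMetric(packages, "efferentCoupling")); // prints 9.0
    }
}
```

The try/catch and the cast to Number are exactly the ceremony that the one-line Groovy call makes disappear.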

For example, to obtain the maximum instability of all analyzed packages in a code base, man, I can define a method like so:

float maxInstability(){
 return this.maxMetricCalculation("instability")
}

As I said earlier, there are other ways to obtain a maximum value from a collection; in fact, Groovy defines a max method, which one can call on a collection and even pass in a closure or a java.util.Comparator. Using curried closures can add a nice bit of reuse in otherwise uptight cut-and-paste-like code; moreover, Groovy’s dynamic behavior makes it quite easy to take generic-ness even further with its invokeMethod API. Now how’s that for currying some maximum favor with Groovy, man?

Groovy’s the elixir for report overload syndrome

Not long ago, I posted a poll regarding code metrics in which the majority of votes settled on two, not too surprising, copasetic points:

  1. Some people are not sure what the data means
  2. Others aren’t sure which tools to use to obtain the data

Issue #1 spawned a posting a few weeks back regarding the meaning of code metrics; however, issue #2 got me thinking– there are quite a few different tools out there that gather diverse metrics, yet there are few opportunities to effectively view them. For instance, for Java projects, I often find myself recommending people use JavaNCSS, Cobertura, and PMD to name a few hip tools. In fact, in all, one can categorize various tools’ outputs into seven metric types:

This list, by the way, is composed of Paul Duvall’s Big Five code analysis areas– because it’s my bag, I’ve added test results and code size.

But looking at the list above reveals over six different reports, each with a different format and its own way of displaying data visually. This cornucopia of data often leads to report overload syndrome, in which, because of information overload, the data tends to be ignored (not unlike the lack of an Oscar for the ever so hip Saturday Night Fever).

As I was sitting on a plane recently, I found myself looking for an easy way to disseminate the valuable data from the categories above in an effective manner. I ended up with a design for a small table that displays summary data from the various tools and provides links to each tool’s actual report for a more in-depth analysis, should one feel it warranted.

dashboard

As you can see from the image above, the table summarizes the output of seven tools used in a typical build:

  • Test results from JUnit or TestNG
  • JavaNCSS’s count of classes and lines of code as well as the maximum cyclomatic complexity found in an individual method
  • Cobertura’s line and branch coverage
  • The count of FindBugs violations as well as PMD’s count
  • JDepend’s maximum reported Afferent and Efferent coupling
  • The amount of code which is found to be similar as reported by Simian

I ended up building this report via Groovy, which provided an excellent infrastructure for easily parsing the reports from various tools and building a resulting XML document. For example, via Groovy’s hip MarkupBuilder, creating the resulting XML is as easy as writing:

import groovy.xml.MarkupBuilder

String generateReport(){
 def writer = new StringWriter()
 def builder = new MarkupBuilder(writer)

 // each nested call becomes a nested XML element
 builder.analysis() {
  project(name:"${projectName}"){
   build(label:"${buildLabel}",time:"${buildTime}"){
    code_size(){
     classes(this.getClasses())
     loc(this.getLOC())
    }
    tests(){
     tests_run(this.getTestsRun())
     failures(this.getFailures())
     branch_coverage(this.getBranchCoverage())
     line_coverage(this.getLineCoverage())
    }
    static_analysis(){
     pmd_violations(this.getPMDViolationCount())
     findbugs_violations(this.getFindBugsViolationCount())
    }
    code_metrics(){
     code_duplication(this.getCodeDuplication())
     max_complexity(this.getMaxComplexity())
     max_ca(this.getMaxCa())
     max_ce(this.getMaxCe())
    }
   }
  }
 }
 return writer.toString()
}

Of course, all the parsing is done elsewhere; what’s more, Groovy’s easy Ant integration made it simple to use the newly created reporting application. For example, the Groovy application which builds the report is a jar file that other projects then utilize as follows:

<target name="dashboard" depends="groovy-init,all-inspect">
 <groovy classpathref="build.classpath">
  import org.discoblog.merlin.metrics.report.dashboard.Dashboard
  import org.discoblog.merlin.metrics.report.dashboard.tools.ToolProperties

  def fullpath = "${properties.basedir}/${properties.defaulttargetdir}"

  def dashboard = new Dashboard(projectName:"${pname}",
     buildLabel:"${label}", buildTime:"${new Date()}")

  //tool properties initialization and subsequent
  //adding to dashboard instance...

  def dashpath = "${fullpath}/dashboard.xml"

  new File(dashpath).write(dashboard.generateReport())

  ant.xslt(in:dashpath,
    out:"${properties.defaulttargetdir}/dashboard.html",
    style:"${properties.defaulttargetdir}/lib/report-style.xsl")
 </groovy>
</target>

As you can see from the code above, Groovy marries nicely with Ant, and the resulting target (which of course relies on all seven of the previously mentioned reports having been run) builds an HTML file like the one displayed above.

Now when a full build is run, rather than sifting through 7 different reports, I can simply examine one report and determine for myself if I’d like to dig deeper (say for instance, there is a test failure or coverage dropped from my previous examination). Hopefully this little report will relieve that jive turkey report overload syndrome and help people in making sense of code metrics. Neat-o, man!

Aggregate Cyclomatic complexity is meaningless

Recently, there have been a number of hip online discussions regarding code metrics and their associated value. There have been some excellent points made; however, because it’s my bag, I did notice an apparent misunderstanding when it comes to Cyclomatic complexity. This metric only has meaning in the context of a single method. Mentioning that a class has a Cyclomatic complexity of X is essentially useless.

Because Cyclomatic complexity measures pathing in a method, every method has at least a Cyclomatic complexity of 1, right? So, the following getter method has a CCN value of 1:

public Account getAccount(){
   return this.account;
}

It’s clear from this boogie method that account is a property of this class. Now imagine that this class has 15 properties and follows the typical getter/setter paradigm for each property and those are the only methods available. That means the class has 30 simple methods, each with a Cyclomatic complexity value of 1. The aggregate value of the class is then 30.

Does this value have any meaning, man? Of course, watching it over time may yield something interesting; however, on its own, as an aggregate value, it is essentially meaningless. 30 for the class means nothing; 30 for a single method, though, means something.

The next time you find yourself reading a copasetic aggregate Cyclomatic complexity value for a class, make sure you understand how many methods the class contains. If the aggregate Cyclomatic complexity value of a class is 200, it shouldn’t raise any red flags until you know the count of methods. What’s more, if you find that the method count is low yet the Cyclomatic complexity value is high, you will almost always find the complexity localized to a single method. Right on!
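To put numbers on that, here is a small Java sketch comparing the aggregate figure with the largest per-method figure for two classes. The per-method values are hypothetical, typed in by hand rather than produced by any tool, and the class and method names are made up for illustration.

```java
import java.util.Collections;
import java.util.List;

class AggregateComplexitySketch {
    // Sum of the per-method values: the aggregate figure reported for a class.
    static int aggregate(List<Integer> perMethod) {
        return perMethod.stream().mapToInt(Integer::intValue).sum();
    }

    // Largest single-method value: the figure that points at risky code.
    static int maxPerMethod(List<Integer> perMethod) {
        return perMethod.stream().mapToInt(Integer::intValue).max().orElse(0);
    }

    public static void main(String[] args) {
        // 15 properties with a getter and a setter each, every method at 1.
        List<Integer> accessorsOnly = Collections.nCopies(30, 1);
        // Hypothetical class: two trivial methods and one tangled one.
        List<Integer> oneHotspot = List.of(1, 1, 28);

        System.out.println(aggregate(accessorsOnly) + " / " + maxPerMethod(accessorsOnly)); // prints 30 / 1
        System.out.println(aggregate(oneHotspot) + " / " + maxPerMethod(oneHotspot));       // prints 30 / 28
    }
}
```

Both classes aggregate to 30, yet only the second one hides a method worth refactoring.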
