January 2007

Preventive programming à la AOP

While defensive programming techniques (like checking whether a parameter is null) effectively guarantee the condition of a method’s input, these conditionals become repetitive if they’re used across a wide expanse of code. In this month’s In pursuit of code quality series on IBM developerWorks, I introduce an easier way to add reusable validation constraints to your code using the power of AOP, design by contract, and a handy library called OVal. Check out “Defensive programming with AOP” and don’t forget to boogie over to the “Improve Your Java Code Quality” forum and let it all hang out, man!
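To see why those conditionals get repetitive, here is a minimal plain-Java sketch. It does not use OVal or AOP, and every name in it (AccountService, describe, notNull) is hypothetical; it only contrasts hand-written guards with a single reusable check.

```java
// Hypothetical example class; none of these names come from OVal.
final class AccountService {

    // Hand-written guards: the same conditional appears once per parameter.
    static String describe(String owner, String currency) {
        if (owner == null) {
            throw new IllegalArgumentException("owner must not be null");
        }
        if (currency == null) {
            throw new IllegalArgumentException("currency must not be null");
        }
        return owner + " (" + currency + ")";
    }

    // The same check factored into one reusable helper.
    static <T> T notNull(T value, String name) {
        if (value == null) {
            throw new IllegalArgumentException(name + " must not be null");
        }
        return value;
    }

    // Same behavior as describe(), with the guards reduced to helper calls.
    static String describeWithHelper(String owner, String currency) {
        return notNull(owner, "owner") + " (" + notNull(currency, "currency") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeWithHelper("Ann", "USD"));
        try {
            describe(null, "USD");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A helper like this still has to be called by hand in every method; the approach described in the article goes a step further by declaring the constraint once and letting an aspect enforce it.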

Groovy’s the elixir for report overload syndrome

Not long ago, I posted a poll regarding code metrics in which the majority of votes settled on two, not too surprising, copasetic points:

  1. Some people are not sure what the data means
  2. Others aren’t sure which tools to use to obtain the data

Issue #1 spawned a posting a few weeks back regarding the meaning of code metrics; however, issue #2 got me thinking– there are quite a few different tools out there that gather diverse metrics, yet there are few opportunities to effectively view them. For instance, for Java projects, I often find myself recommending people use JavaNCSS, Cobertura, and PMD to name a few hip tools. In fact, in all, one can categorize various tools’ outputs into seven metric types:

This list, by the way, is composed of Paul Duvall’s Big Five code analysis areas– because it’s my bag, I’ve added test results and code size.

But looking at the list above reveals over six different reports, each with different formats and variations on how data is visually displayed. This cornucopia of data often leads to report overload syndrome, in which, because of information overload, the data tends to be ignored (not unlike the lack of an Oscar for the ever so hip Saturday Night Fever).

As I was sitting on a plane recently, I found myself looking for an easy way to present the valuable data from the categories above effectively. I ended up with a design for a small table that displays summary data from the various tools and provides links to each tool’s actual report for a more in-depth analysis, should one feel it warranted.

[Image: dashboard]

As you can see from the image above, the table summarizes the output of seven tools used in a typical build:

  • Test results from JUnit or TestNG
  • JavaNCSS’s count of classes and lines of code as well as the maximum cyclomatic complexity found in an individual method
  • Cobertura’s line and branch coverage
  • The count of FindBugs violations as well as PMD’s count
  • JDepend’s maximum reported Afferent and Efferent coupling
  • The amount of code which is found to be similar as reported by Simian

I ended up building this report via Groovy, which provided an excellent infrastructure for easily parsing the reports from various tools and building a resulting XML document. For example, via Groovy’s hip MarkupBuilder, creating the resulting XML is as easy as writing:

// Requires: import groovy.xml.MarkupBuilder
String generateReport() {
  def writer = new StringWriter()
  def builder = new MarkupBuilder(writer)

  // Each nested closure becomes a nested XML element
  builder.analysis() {
    project(name: "${projectName}") {
      build(label: "${buildLabel}", time: "${buildTime}") {
        code_size() {
          classes(this.getClasses())
          loc(this.getLOC())
        }
        tests() {
          tests_run(this.getTestsRun())
          failures(this.getFailures())
          branch_coverage(this.getBranchCoverage())
          line_coverage(this.getLineCoverage())
        }
        static_analysis() {
          pmd_violations(this.getPMDViolationCount())
          findbugs_violations(this.getFindBugsViolationCount())
        }
        code_metrics() {
          code_duplication(this.getCodeDuplication())
          max_complexity(this.getMaxComplexity())
          max_ca(this.getMaxCa())
          max_ce(this.getMaxCe())
        }
      }
    }
  }
  return writer.toString()
}

Of course, all the parsing is done elsewhere; what’s more, Groovy’s easy Ant integration made it simple to use the newly created reporting application. For example, the Groovy application which builds the report is a jar file that other projects then utilize as follows:

<target name="dashboard" depends="groovy-init,all-inspect">
 <groovy classpathref="build.classpath">
  import org.discoblog.merlin.metrics.report.dashboard.Dashboard
  import org.discoblog.merlin.metrics.report.dashboard.tools.ToolProperties

  def fullpath = "${properties.basedir}/${properties.defaulttargetdir}"

  def dashboard = new Dashboard(projectName:"${pname}",
     buildLabel:"${label}", buildTime:"${new Date()}")

  //tool properties initialization and subsequent
  //adding to dashboard instance...

  def dashpath = "${fullpath}/dashboard.xml"

  new File(dashpath).write(dashboard.generateReport())

  ant.xslt(in:dashpath,
    out:"${properties.defaulttargetdir}/dashboard.html",
    style:"${properties.defaulttargetdir}/lib/report-style.xsl")
 </groovy>
</target>

As you can see from the code above, Groovy marries nicely with Ant, and the resulting target (which of course relies on all seven of the previously mentioned reports having been run) builds an HTML file like the one displayed above.

Now when a full build is run, rather than sifting through 7 different reports, I can simply examine one report and determine for myself if I’d like to dig deeper (say for instance, there is a test failure or coverage dropped from my previous examination). Hopefully this little report will relieve that jive turkey report overload syndrome and help people in making sense of code metrics. Neat-o, man!
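The parsing step itself isn't shown in this post. As a rough idea of what pulling one summary value out of a tool report can look like, here is a minimal Java sketch. It assumes a Cobertura-style XML report whose root coverage element carries line-rate and branch-rate attributes; the CoverageSummary class and the sample XML are made up for illustration and are not part of the dashboard application.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical holder for the two coverage figures shown on the dashboard.
final class CoverageSummary {
    final double lineRate;
    final double branchRate;

    CoverageSummary(double lineRate, double branchRate) {
        this.lineRate = lineRate;
        this.branchRate = branchRate;
    }

    // Reads the two rate attributes from the root element of the report.
    // Assumption: the root element carries line-rate and branch-rate attributes.
    static CoverageSummary parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return new CoverageSummary(
                    Double.parseDouble(root.getAttribute("line-rate")),
                    Double.parseDouble(root.getAttribute("branch-rate")));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable coverage report", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample; a real report has many more attributes and child elements.
        String sample = "<coverage line-rate=\"0.82\" branch-rate=\"0.5\"/>";
        CoverageSummary summary = parse(sample);
        System.out.println("line coverage:   " + Math.round(summary.lineRate * 100) + "%");
        System.out.println("branch coverage: " + Math.round(summary.branchRate * 100) + "%");
    }
}
```

Each tool needs its own small reader along these lines; the dashboard then only has to combine a handful of numbers.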

The weekly bag– Jan 26

Ruby refactor city

Because I like refactoring and I cannot lie, I was quite excited to see that a group of disco Ruby stars have begun writing a Ruby edition of Martin Fowler’s seminal work “Refactoring: Improving the Design of Existing Code”; thus far they’ve published chapter 1, which is much like the original, just with Ruby code instead of Java.

Keep an eye on rubypatterns.com for updates– hopefully at some point they’ll add an RSS feed!

The weekly bag– Jan 19

Here are this week’s disco links:

Refactoring in NAnt generic-ness

I recently needed to create a copasetic NAnt script that handled database schema management, i.e., the script would create a database and populate it with seed data. Such a process is central to the notion of Continuous Database Integration (which happens to be the subject of chapter 5 in Addison-Wesley’s forthcoming book entitled “Continuous Integration: Improving Software Quality and Reducing Risk”).

Using NAntContrib’s sql task, I proceeded to create two hip targets. The first ran a DDL script that dropped any existing tables and then re-created them. The second ran a DML script, which populated the database schema with data. This is essentially an up front process that enables one to, say, deploy a fully functioning application for testing, in an automated fashion from within a build process. Lastly, I created an additional boss target, dubbed database-prepare, which called the first two targets and acted as a handy facade.

Once completed, I was essentially left with a NAnt script that screamed in the face of the DRY principle, as I had two targets that did essentially the same thing. For example, below is a snippet of the offending build file:

<target name="database-prepare">
 <call target="database-create"/>
 <call target="database-load"/>
</target>

<target name="database-create">
 <echo message="Creating database definition with ${data-definitions}"/>
 <sql connstring="${project.db.conn}"
     delimiter=";" delimstyle="Normal"
     print="true" source="${data-definitions}"/>
</target>

<target name="database-load">
 <echo message="Loading database with data found in ${data-load}"/>
 <sql connstring="${project.db.conn}"
     delimiter=";" delimstyle="Normal"
     print="true" source="${data-load}"/>
</target>

At this point, because it’s my bag, I found myself feeling somewhat disquieted (of course, in a disco sort of way), longing for a more expressive form of build scripting, say with something like Boo. But what I needed wasn’t a script task so much as a richer way of structuring a build, like what one finds in Rake. With more expressive build platforms (that don’t rely on XML) one can create reusable functions, which is exactly what I wanted: a generic sql task to which I could pass the desired load file at build time, depending on the command.

In the end, rather than rewrite the entire build, I chose to force target reuse in NAnt via the property task. While not perfect, it works– the build file has a generic sql target that relies on a smokin’ property being set– if the property isn’t set, the target will fail. The way the build is structured, the only way to cause a failure is to call the generic target directly (to my knowledge, there isn’t a way to explicitly deny calling a target directly– private targets anyone?). I chose to name the target with a pythonic underscore to signify this target’s private-ness.

The API of the build is essentially still the same. There are three public targets: database-create and database-load each initialize the property to the proper DDL or DML file and then call the generic sql target, while database-prepare calls them both, as follows:

<property name="data-load" value="./sql/data-load.sql" />
<property name="data-definitions" value="./sql/data-definitions.sql" />

<target name="database-create">
 <property name="data-file" dynamic="true" value="${data-definitions}"/>
 <call target="_database-do"/>
</target>

<target name="database-load">
 <property name="data-file" dynamic="true" value="${data-load}"/>
 <call target="_database-do"/>
</target>

<target name="_database-do">
 <echo message="loading ${data-file}"/>
 <sql connstring="${project.db.conn}"
     delimiter=";" delimstyle="Normal"
     print="true" source="${data-file}"/>
</target>

<target name="database-prepare">
 <call target="database-create"/>
 <call target="database-load"/>
</target>

While not perfect, it certainly is more in line with what I’d do if I were coding this logic in C#, for example. A build file (or files) is one of the most important assets a project has– without an effective build, no matter how many tests you write or how perfect the code is, customers will have a hard time receiving working, reliable applications. So the next time you find yourself wincing at your build script, take the time to refactor it, man!

Is developer testing highway robbery?

There is a totally disco post over on the “Improve Your Java Code Quality” forum hosted by IBM developerWorks, which essentially asks: is the price of writing tests worth it? The poster writes

“I am frustrated very much, because I spent much more time on writing unit test code than on writing the business method implementation.”

Interestingly enough, his feelings are by no means unique– quality isn’t free, man. Testing isn’t easy, but I’m convinced it’s worth the pain for a number of reasons. What do you think?

The weekly bag– Jan 12

Parametric testing show down

One of my favorite features of TestNG is its hip parametric testing ability, which allows you to create generic test cases and then vary the test values. You can use parameters from XML files or even use the DataProvider feature for richer parameter types. In fact, for a while, because it’s my bag baby, I’ve been espousing TestNG’s out of the box parametric testing features as a reason to use TestNG as opposed to JUnit for higher level testing. JUnit 4, however, now supports parametric tests, and you’ll find that its parametric testing is rather similar to TestNG’s DataProvider.

For example, in TestNG, if I’d like to create a generic test and vary its parameters, I have to do three things:

  1. Create a generic test whose parameters are the parameterized values
  2. Create a DataProvider method, which feeds the values for the test
  3. Link the DataProvider method to the test via the @Test annotation

Hence, I can create a copasetic generic test method as follows:

public void verifyHierarchies(Class clzz, String[] names)
 throws Exception{
  Hierarchy hier = HierarchyBuilder.buildHierarchy(clzz);
  assertEquals(hier.getHierarchyClassNames(), names, "values were not equal");
}

Then I can create a feeder method as follows:

@DataProvider(name = "class-hierarchies")
public Object[][] dataValues(){
 return new Object[][]{
   {Vector.class, new String[] {"java.util.AbstractList",
      "java.util.AbstractCollection"}},
   {String.class, new String[] {}}
  };
}

Note how I have to declare a name for the DataProvider; in my case, I dub it class-hierarchies. I can now link the two methods by using the @Test annotation and setting the dataProvider value to the name I declared on my feeder method:

@Test(dataProvider = "class-hierarchies")

That was fairly disco, no? JUnit 4 takes a somewhat similar approach that requires a bit more legwork though:

  1. Create a generic test that takes no parameters
  2. Create a static feeder method that returns a Collection type and decorate it with the @Parameters annotation
  3. Create class members for the parameter types required in the generic method defined in step 1
  4. Create a constructor that takes these parameter types and correspondingly links them to the class members defined in step 3
  5. Specify the test case be run with the Parameterized class via the @RunWith annotation

If I take the same code above and rework it to use JUnit 4 parameterizations, I first must create the generic test:

@Test
public void verifyHierarchies() throws Exception {
 Hierarchy hier = HierarchyBuilder.buildHierarchy(clzz);
 assertEquals("values were not equal",hier.getHierarchyClassNames(), names);
}

Second, I need to create a dynomite feeder method, which functions much like TestNG in that it requires an Object array with matching parameter types.

@Parameters
public static Collection<Object[]> hierarchyValues() {
 // Each inner array supplies one set of constructor arguments
 return Arrays.asList(new Object[][] {
  {Vector.class, new String[] { "java.util.AbstractList",
     "java.util.AbstractCollection" } },
  {String.class, new String[] {} } });
}

Note how this method is decorated with the @Parameters annotation, man. Next, because the parameters are of types Class and String[], I create two class members:

private Class clzz;
private String[] names;

Step 4 requires I create a constructor, which links values:

public HierarchyBuilderParameterTest(Class clzz, String[] names) {
 this.clzz = clzz;
 this.names = names;
}

Lastly, make sure you specify at the class level that this test be run with the Parameterized class like so:

@RunWith(Parameterized.class)

As you can see, JUnit makes you jump through a few more hoops, yet the fundamental requirements are quite similar to TestNG’s DataProvider feature. When push comes to shove, I still find TestNG’s semantics much simpler, but it’s nevertheless a disco feature to find in JUnit 4. Dig it?
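The HierarchyBuilder class under test isn't shown in this post. For readers wondering where the expected values in both feeder methods come from, here is a minimal plain-Java sketch that assumes the builder simply walks a class's superclass chain and leaves out java.lang.Object. SuperclassNames is a hypothetical stand-in, not the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for the class under test.
final class SuperclassNames {

    // Collects the names of clzz's superclasses, nearest first,
    // stopping before java.lang.Object.
    static String[] of(Class<?> clzz) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = clzz.getSuperclass();
             c != null && c != Object.class;
             c = c.getSuperclass()) {
            names.add(c.getName());
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(of(Vector.class)));
        System.out.println(Arrays.toString(of(String.class)));
    }
}
```

Under that assumption, Vector yields java.util.AbstractList followed by java.util.AbstractCollection, and String yields an empty array, which matches the two rows fed to the tests above.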

Aggregate Cyclomatic complexity is meaningless

Recently, there have been a number of hip online discussions regarding code metrics and their associated value. There have been some excellent points made; however, because it’s my bag, I did notice an apparent misunderstanding when it comes to Cyclomatic complexity. This metric only has meaning in the context of a single method. Mentioning that a class has a Cyclomatic complexity of X is essentially useless.

Because Cyclomatic complexity measures pathing in a method, every method has at least a Cyclomatic complexity of 1, right? So, the following getter method has a CCN value of 1:

public Account getAccount(){
   return this.account;
}

It’s clear from this boogie method that account is a property of this class. Now imagine that this class has 15 properties and follows the typical getter/setter paradigm for each property and those are the only methods available. That means the class has 30 simple methods, each with a Cyclomatic complexity value of 1. The aggregate value of the class is then 30.

Does this value have any meaning, man? Of course, watching it over time may yield something interesting; however, on its own, as an aggregate value, it is essentially meaningless. 30 for the class means nothing; 30 for a method means something, though.

The next time you find yourself reading a copasetic aggregate Cyclomatic complexity value for a class, make sure you understand how many methods the class contains. If the aggregate Cyclomatic complexity value of a class is 200– it shouldn’t raise any red flags until you know the count of methods. What’s more, if you find that the method count is low yet the Cyclomatic complexity value is high, you will almost always find the complexity localized to a method. Right on!
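To make the arithmetic concrete, here is a minimal Java sketch that compares two classes with the same aggregate value. The per-method numbers for the second class are invented for illustration, and ComplexitySummary is a hypothetical helper, not part of any metrics tool.

```java
import java.util.Arrays;

// Hypothetical helper: works on per-method cyclomatic complexity values.
final class ComplexitySummary {

    // Sum of all per-method values: the "aggregate" figure for a class.
    static int aggregate(int[] perMethod) {
        return Arrays.stream(perMethod).sum();
    }

    // Highest single-method value: where the complexity actually lives.
    static int max(int[] perMethod) {
        return Arrays.stream(perMethod).max().orElse(0);
    }

    public static void main(String[] args) {
        // 15 getters and 15 setters, each with a value of 1.
        int[] bean = new int[30];
        Arrays.fill(bean, 1);

        // Invented numbers: one branchy method plus two trivial ones.
        int[] tangled = {28, 1, 1};

        System.out.println("bean:    aggregate=" + aggregate(bean)
                + " methods=" + bean.length + " max=" + max(bean));
        System.out.println("tangled: aggregate=" + aggregate(tangled)
                + " methods=" + tangled.length + " max=" + max(tangled));
    }
}
```

Both classes report an aggregate of 30, yet only the second hides a method worth worrying about, which is why the method count and the per-method maximum matter more than the sum.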
