March 2008

Abecedarian Groovy

New to Groovy? Fear not, hip daredevil! While “Groovy in Action” is undoubtedly your vade mecum as it is for countless other madcaps, the copasetic folks at IBM developerWorks have recently published “Fluently Groovy” to quell your Groovy thirsts, baby.

This tutorial is for hip developers who are unfamiliar with Groovy and want a quick and copasetic introduction to the basics, man. Get started with Groovy’s simplified variation of the Java syntax and learn about essential features like native collections and closures. Write your first Groovy class, and then see how easily you can use JUnit to test it, baby.

Because it’s your bag, you’ll walk (no, dance!) away from this one-hour tutorial with a fully functioning Groovy development environment and the skills to use it. Right on! Best of all, you’ll have learned first-hand how to use Groovy and Java code together in your everyday Java application development. How’s that sound for a party, baby?! What are you waiting for? Read on!

The weekly bag– March 21

Missed a week! Don’t fret, such a mishap doesn’t imply there weren’t any links hip enough to list. It was disco night at the local YMCA…

Unambiguously analyzing metrics

Software metrics are objective measurements of particular aspects of code– for instance, cyclomatic complexity measures complexity without any regard for why code contains a certain number of paths. For metrics to be useful, baby, they must be applied subjectively. In the case of complexity, there may be circumstances that warrant such code (although I’ve yet to find complex code that can’t benefit from refactoring). I’ve also found that, on the whole, metrics are more copasetic when combined with other metrics and trended– for instance, complexity alone is somewhat interesting, but pairing complexity with code coverage paints a much more detailed picture. High complexity with low coverage is clearly more risky than the same complexity with high code coverage– even the CRAP metric captures this relationship.

One particular hip metric that I find helpful is the ratio of copy and pasted code within a code base, as unknown copy and pasted code will haunt you, man. For instance, copy and pasted code replicates bugs and poorly coded algorithms, to name a few nefarious aspects; consequently, understanding what code has been replicated can help teams refactor offending code. Having run various copy and paste analyzers on more code bases than I care to admit (and because it’s my bag), I’ve found that all code bases have a certain level of offending code that triggers a copy and paste detection. One particular tool, CPD, is nice enough to create a report containing the offending code like so:

<duplication lines="7" tokens="53">
 <file line="36" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <file line="42" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <file line="48" path="cbd4/blackjack/src/com/stelligent/blackjack/Hand.java"/>
 <codefragment>
 <![CDATA[
    } else if (first.equals("Jack")) {
      if(!second.equals("King") && !second.equals("Queen") && !second.equals("Jack")){
         return Integer.parseInt(second) + 10;
     } else {
      return 20;
    }
   }else if(first.equals("9")){
]]>
</codefragment>
</duplication>

As you can see, CPD reports the total number of lines of copy and pasted code and where that bogue code can be found. This data is certainly helpful; however, it doesn’t tell the entire story– while 7 lines don’t seem like all that much code, you’d probably reconsider if it were 7 lines of code in a 30-line code base or, more realistically, 700 lines in a code base of a few thousand lines. Therein lies the catch– CPD’s data is really only helpful when viewed on the whole (or as a ratio– that is, total lines of copy and pasted code over total lines of code). Unfortunately, CPD doesn’t report the total lines of code scanned– only the total lines of copy and pasted code. For instance, in this sample code base, there were 9 suspected copy and pasted code fragments totaling about 120 lines of code (or CPLOC).

Luckily, there’s another handy tool which reports the total lines of code (or LOC) in a code base– JavaNCSS. Running JavaNCSS yielded a value of about 610 LOC; therefore the ratio of copy and pasted code is CPLOC/LOC or 120/610, which is roughly 20%.

20% CPLOC is probably a bad thing– at a minimum it is worth knowing about. 20% today might not be too important to know, but knowing that it increased to 25% next week would be an indication that things are degrading– likewise, seeing a value decrease over time indicates the code base is actively being improved. Yet, how can teams possibly monitor this trippin’ data?

Reports are hip, but in truth, reports by tools like CPD are read once– the first time they are generated. After that, it’s anyone’s guess when someone will actively read the report again. Hence, I find it particularly helpful to throw the report out and let the build itself proactively tell me when a particular metric gets out of hand. This means that my build has to monitor a particular metric– and in the case of the CPLOC ratio, my build has to gather data from two sources– JavaNCSS’s report and CPD’s.

Fortunately, this is easy with Groovy– if, for instance, you are using Ant for builds, you can first generate the two reports as follows:

<target name="cpd">
 <mkdir dir="target/reports"/>
 <taskdef name="cpd" classname="net.sourceforge.pmd.cpd.CPDTask"
   classpathref="classpath"/>
 <cpd minimumTokenCount="10" outputFile="target/reports/cpd.xml" format="xml">
  <fileset dir="src">
   <include name="**/*.java"/>
  </fileset>
 </cpd>
</target>

The code above generates a CPD XML report from all the code in a src directory and the following code creates a JavaNCSS report from the same code base:

<target name="javancss">
 <taskdef name="javancss" classname="javancss.JavancssAntTask"
   classpathref="classpath" />
 <javancss srcdir="src" generateReport="true"
   abortOnFail="true" ccnPerFuncMax="100"
   outputfile="target/reports/javancss_metrics.xml" format="xml" />
</target>

The only high-level step left to do is to put the two metrics together; however, this step actually takes a few sub-steps, baby. For instance, obtaining the total lines of CPLOC requires iterating over a collection of duplication elements in the CPD xml file. Consequently, the following steps detail the effort required to obtain this metric:

  • parse the JavaNCSS xml report and obtain the total LOC
  • parse the CPD xml report and obtain the total CPLOC
  • divide the two and compare the result to some threshold
  • if the threshold is exceeded, fail the build

Groovy, by the way, is particularly well suited for such a task (as if you didn’t know that, man)– parsing XML with Groovy is practically effortless– like disco dancing, eh? For instance, obtaining the total LOC from JavaNCSS’s xml file is as easy as:

int ncss = Integer.parseInt(jncssroot.packages.total.ncss.text())

Note, I’m coercing to integer values as I’d like to divide (and round) my result– if I don’t explicitly convert to ints, I’ll be left with String division, which doesn’t work so well.
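To make the String issue concrete, here is a minimal sketch that parses a cut-down, hypothetical stand-in for the JavaNCSS report (the real file carries far more detail; only the packages/total/ncss path comes from the snippet above). The import assumes Groovy 3 or later, where XmlSlurper lives in groovy.xml; on older versions the class sits in groovy.util and needs no import.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; older versions auto-import groovy.util.XmlSlurper

// Hypothetical, heavily trimmed JavaNCSS-style report
def xml = '''<javancss>
  <packages>
    <total>
      <ncss>610</ncss>
    </total>
  </packages>
</javancss>'''

def jncssroot = new XmlSlurper().parseText(xml)

// text() always hands back a String, never a number
def raw = jncssroot.packages.total.ncss.text()
assert raw instanceof String

// Dividing a String fails at runtime: String has no div() method
def stringDivisionFailed = false
try {
    def nonsense = raw / "2"
} catch (MissingMethodException e) {
    stringDivisionFailed = true
}
assert stringDivisionFailed

// Converting first makes the arithmetic behave
int ncss = Integer.parseInt(raw)
assert ncss == 610
```

String’s toInteger() method from the Groovy JDK would work just as well as Integer.parseInt here.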

Parsing CPD’s xml document is slightly more complex– slightly in that it takes 3 times as much code:

def cpdtot = 0
cpdroot.duplication.each { elem ->
 cpdtot += Integer.parseInt(elem.@lines.text())
}

Again, parsing an XML document yields String values; accordingly, I need to use Integer’s parseInt method.
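As a sketch of the same summation in a more functional style, the loop can be collapsed into collect and sum. The XML below is a made-up, minimal CPD-style report (the second duplication entry and its values are invented for illustration), and the import again assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; older versions auto-import groovy.util.XmlSlurper

// Hypothetical, trimmed-down CPD report with two duplications
def xml = '''<pmd-cpd>
  <duplication lines="7" tokens="53"/>
  <duplication lines="12" tokens="80"/>
</pmd-cpd>'''

def cpdroot = new XmlSlurper().parseText(xml)

// Turn each lines attribute into an int, then add them up.
// sum() on an empty list returns null, hence the ?: 0 fallback.
def cpdtot = cpdroot.duplication.collect { it.@lines.text().toInteger() }.sum() ?: 0

assert cpdtot == 19
```

Whether this reads better than the explicit each loop is a matter of taste; both yield the same total.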

Next, all I need to do is divide the two and, in my case, I’m aggressively rounding up via Java’s ceiling call as follows:

def ratio = Math.ceil((cpdtot / ncss) * 100)

Multiplying the result by 100 gives me a percentage value, of course, and lastly, I compare that to a threshold value:

if(ratio > Double.parseDouble(properties.cpd_threshold)){
  ant.fail(message:
   "cut and paste ratio was greater than ${properties.cpd_threshold}%, it was ${ratio}%")
}
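To see the arithmetic on its own, outside of Ant, here is a small sketch using the rough figures quoted earlier (120 CPLOC, 610 LOC). The threshold of 15 is a made-up value, and a plain boolean stands in for the ant.fail call.

```groovy
// Approximate figures from the sample code base discussed above
def cpdtot = 120           // CPLOC, as summed from CPD's report
int ncss = 610             // LOC, as reported by JavaNCSS
def cpd_threshold = "15"   // hypothetical threshold; Ant properties arrive as Strings

// Java-style integer division would truncate the ratio to 0 ...
assert cpdtot.intdiv(ncss) == 0

// ... but Groovy's / on two ints yields a BigDecimal, so the fraction survives
def ratio = Math.ceil((cpdtot / ncss) * 100)

// Stand-in for the ant.fail(...) call in the build script
boolean exceeded = ratio > Double.parseDouble(cpd_threshold)

println "ratio=${ratio}% exceeded=${exceeded}"
```

Rounding up with Math.ceil means even a sliver over a whole percentage counts against the threshold, which is the aggressive behavior described above.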

Puttin’ it all together, baby, yields a groovy Ant script with a hip target:

<target name="cpd-threshold" depends="metrics">
 <groovy>
 def jncssroot = new XmlSlurper().parse("target/reports/javancss_metrics.xml")
 int ncss = Integer.parseInt(jncssroot.packages.total.ncss.text())

 def cpdroot = new XmlSlurper().parse("target/reports/cpd.xml")

 def cpdtot = 0
 cpdroot.duplication.each { elem ->
  cpdtot += Integer.parseInt(elem.@lines.text())
 }

 def ratio = Math.ceil((cpdtot / ncss) * 100)

 if(ratio > Double.parseDouble(properties.cpd_threshold)){
  ant.fail(message:
   "cut and paste ratio was greater than ${properties.cpd_threshold}%, it was ${ratio}%")
 }
 </groovy>
</target>

Reports are hip, but they are usually only read once– the first time they are generated. Rather than waiting to find out that there’s a problem, proactively analyzing a hip metric (such as CPLOC/LOC) enables rapid feedback and rapid corrections– is that unambiguous or what?

The weekly bag– March 7

Sorry to have missed the leap year weekly bag, baby. Without further ado:

Unadulterated Java is so groovy

Groovy is an alternate language for the JVM– alternate in that Groovy is a simpler, more expressive Java (which, by the way, also happens to work with normal Java). In fact, if you already know Java, you basically know Groovy, man.

Groovy’s syntax is a less strict version of Java’s and adds a few new features here and there. You could say that Groovy’s syntax is terse, which yields a highly expressive medium for conveying behavior without a lot of extraneous verbiage. But the beauty is that that verbiage isn’t gone– it’s assumed. Hence, Groovy is an unadulterated version of Java, baby.

As a demonstration, here is a hip Java class that represents a song, aptly named Song. As you can imagine, this code exists in a file named Song.java and is located in some sort of package structure (like com.acme.blah).

public class Song {
 private String name;
 private String genre;
 private String artist;

 public String getName() {
  return name;
 }

 public void setName(String name) {
  this.name = name;
 }

 public String getGenre() {
  if(genre != null){
   return genre.toUpperCase();
  }else{
   return null;
  }
 }

 public void setGenre(String genre) {
  this.genre = genre;
 }

 public String getArtist() {
  return artist;
 }

 public void setArtist(String artist) {
  this.artist = artist;
 }
}

This bogue class doesn’t do anything particularly interesting and is basically a JavaBean– but it serves as an excellent demonstration of how you can achieve the same behavior (albeit simple) in Groovy with fewer lines of code.

As a first step, you can make this a Groovy class by just changing the file’s extension to .groovy– in fact, Song.groovy, as is, will compile just fine with groovyc.

One thing about Groovy is that it is Java without mandatory accessibility modifiers (i.e. you don’t have to specify public– classes and methods are assumed to be so unless you say otherwise) and semi-colons. Consequently, I can trim down this class somewhat by removing all semi-colons and public modifiers, baby.

class Song {
 private String name
 private String genre
 private String artist

 String getName() {
  return name
 }

 void setName(String name) {
  this.name = name
 }

 String getGenre() {
  if(genre != null){
   return genre.toUpperCase()
  }else{
   return null
  }
 }

 void setGenre(String genre) {
  this.genre = genre
 }

 String getArtist() {
  return artist
 }

 void setArtist(String artist) {
  this.artist = artist
 }
}

Already, because it’s my bag, Song is getting a bit shorter, but nothing to write home about.

What’s particularly handy in Groovy is the way it treats properties– in this case, Song has 3 (name, genre, and artist). By convention, in Java, if you want to access these values (i.e. via getters and setters) you copasetically create corresponding setProperty and getProperty methods. In Groovy, these accessors are generated for you; consequently, in the Song class, I can remove those methods leaving me with the following code:

class Song {
 private String name
 private String genre
 private String artist

 String getGenre() {
  if(genre != null){
   return genre.toUpperCase()
  }else{
   return null
  }
 }
}

Note that I left in the getGenre method– that’s doing something special.

Be careful though, disco dancers, as in Groovy, if I leave the properties as private, Groovy will not generate the accessors– accordingly, the next step is to remove the private modifier on the properties.

class Song {
 String name
 String genre
 String artist

 String getGenre() {
  if(genre != null){
   return genre.toUpperCase()
  }else{
   return null
  }
 }
}
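One way to check the private-versus-property rule for yourself is to ask each class which public methods it carries. The sketch below uses two hypothetical throwaway classes (PrivateSong and PropertySong are illustration names, not part of the Song example) and plain Java reflection.

```groovy
// Hypothetical class: a private field, so no accessors are generated
class PrivateSong {
    private String name
}

// Hypothetical class: no modifier, so name becomes a property
class PropertySong {
    String name
}

// Collect the names of all public methods on each class
def privateMethods = PrivateSong.class.methods*.name
def propertyMethods = PropertySong.class.methods*.name

assert !privateMethods.contains('getName')
assert !privateMethods.contains('setName')
assert propertyMethods.contains('getName')
assert propertyMethods.contains('setName')
```

The Song class as it now stands falls into the second camp: every field has its generated getter and setter.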

Looking at this code, I can hear the Java faithful grinding their teeth over what appears to be a lack of encapsulation, baby– in fact, in Groovy, you can access the name property directly!

Or can you?

Using a simple GroovyTestCase, I can knock out a quick test to see property access in action:

class SongTest extends GroovyTestCase {

 void testPropertyAccess() {
  def sng = new Song(artist:"Lipps, Inc",
      name:"FunkyTown", genre:"Disco")

  assertEquals("FunkyTown", sng.name)
  assertEquals("FunkyTown", sng.getName())
 }
}

In the code above, I’m grabbing a property either directly or via the getter method– the same, by the way, is true for setting a value. But the question remains, when the property is seemingly accessed directly, does this bypass the getter (or setter)?

One way to find out is to use the getGenre method, right? It does something special– accordingly, the following test case demonstrates hip encapsulation in action:

void testEncapsulatedAccess() {
 def sng = new Song(artist:"Lipps, Inc",
   name:"FunkyTown", genre:"Disco")

 assertEquals("DISCO", sng.genre)
 assertEquals("DISCO", sng.getGenre())
}

Even though the properties in the Song class are not explicitly private, they are acting privately, aren’t they, man?
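The same holds for writes. As a rough sketch, the hypothetical TrackedSong class below (not part of the original example) counts how often its setter runs, which shows that a seemingly direct assignment from outside the class is routed through setName.

```groovy
// Hypothetical class that counts calls to its hand-written setter
class TrackedSong {
    String name
    int setterCalls = 0

    void setName(String name) {
        setterCalls++
        this.name = name   // inside the class, this hits the field directly
    }
}

def sng = new TrackedSong()

sng.name = "FunkyTown"       // looks like field access, but calls setName
assert sng.setterCalls == 1

sng.setName("Funkytown")     // the explicit call does the same thing
assert sng.setterCalls == 2
assert sng.name == "Funkytown"
```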

Going back to the getGenre method on the Song class, it turns out that Groovy also has a handy syntax for null pointer safety; consequently, I can simplify that method even further.

class Song {
 String name
 String genre
 String artist

 String getGenre() {
  return genre?.toUpperCase()
 }
}
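A quick sketch of what the ?. operator buys you, using a cut-down Song with only the genre property: when genre is null, the call to toUpperCase is skipped and the whole expression evaluates to null instead of blowing up with a NullPointerException.

```groovy
// Trimmed version of Song, keeping only what the example needs
class Song {
    String genre

    String getGenre() {
        return genre?.toUpperCase()
    }
}

// A genre was supplied: the getter upper-cases it
assert new Song(genre: "Disco").genre == "DISCO"

// No genre supplied: ?. short-circuits and yields null
assert new Song().genre == null
```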

Groovy also permits dropping return statements– in essence, the value of the last expression evaluated in a method is assumed as the return value. So, that leaves the getGenre method as:

String getGenre() {
  genre?.toUpperCase()
}
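The implicit return is not limited to one-liners. In this small sketch (tempoLabel is a made-up method for illustration), whichever branch of the if/else runs supplies the return value.

```groovy
// Hypothetical method: no return keyword anywhere
String tempoLabel(int bpm) {
    if (bpm > 120) {
        "fast"
    } else {
        "slow"
    }
}

assert tempoLabel(130) == "fast"
assert tempoLabel(90) == "slow"
```

An explicit return is still allowed, and some teams keep it where a method has several exits and the intent would otherwise be hard to spot.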

Groovy infers types– for instance, writing String value = "groovy"; is a bit verbose, no? Think about it for a second– String value is superfluous isn’t it? value must be a String as I set it to one, right? Likewise, in Groovy, value is clearly a String (because it is set to a String directly!) without having to give the compiler (or runtime) a hint. Accordingly, you can drop types and replace them with Groovy’s hip def keyword.

class Song {
 def name
 def genre
 def artist

 def getGenre() {
  genre?.toUpperCase()
 }
}

Removing an explicit type does have some consequences though– in this case, given that Groovy is Java, these properties must have some type associated with them. What do you think that type is? Right! java.lang.Object, baby. Consequently, if you were to use this Groovy Song class in Java, the getGenre method will have a return type of Object, not String (unless you explicitly make String the return type).
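Reflection makes the difference visible. The sketch below compares two hypothetical cut-down classes (DefSong and TypedSong are illustration names) and inspects the declared return type of getGenre on each.

```groovy
// Hypothetical class using def everywhere
class DefSong {
    def genre

    def getGenre() {
        genre?.toUpperCase()
    }
}

// Hypothetical class keeping the explicit String type
class TypedSong {
    String genre

    String getGenre() {
        genre?.toUpperCase()
    }
}

// What a Java caller would see as the declared return type
assert DefSong.class.getMethod('getGenre').returnType == Object
assert TypedSong.class.getMethod('getGenre').returnType == String

// At runtime the value is a String either way
assert new DefSong(genre: "Disco").genre instanceof String
```

So a Java caller of the def version needs a cast to get a String back, while Groovy callers won’t notice the difference.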

Applying groovy techniques to Java yields a more effective Java– in this case, the Song class started out at roughly 30 lines of code. Refactoring it down a bit by leveraging Groovy, I ended up with basically 10 lines of code. The code is essentially the same– the only behavioral difference being that Object is now king with method return types (should you use this class in normal Java). But switching out def for String doesn’t add any new lines of code, does it?

Groovy is unadulterated Java– in fact, it’s Java without a lot of Java. Can you dig it, baby?