Magic Numbers

Studies have shown that the average person has the capacity to handle about seven pieces of data in their head, plus or minus two [ref]. Hence, we can easily remember phone numbers, but most of us have a slightly more difficult time memorizing credit card numbers, groovy bellbottom widths, launch sequences, boss disco moves, etc.

This principle applies to understanding code. Everyone has probably seen snippets of code like this:

if (entityImplVO != null) {
  List actions = entityImplVO.getEntities();
  if (actions == null) {
     actions = new ArrayList();
  }
  Iterator enItr = actions.iterator();
  while (enItr.hasNext()) {
    entityResultValueObject arVO = (entityResultValueObject) enItr
     .next();
    Float entityResult = arVO.getActionResultID();
    if (assocPersonEventList.contains(entityResult)) {
      assocPersonFlag = true;
    }
    if (arVL.getByName(
      AppConstants.ENTITY_RESULT_DENIAL_OF_SERVICE)
         .getID().equals(entityResult)) {
      if (actionBasisId.equals(actionImplVO.getActionBasisID())) {
        assocFlag = true;
      }
    }
    if (arVL.getByName(
     AppConstants.ENTITY_RESULT_INVOL_SERVICE)
      .getID().equals(entityResult)) {
     if (!reasonId.equals(arVO.getStatusReasonID())) {
       assocFlag = true;
     }
   }
 }
}else{
  entityImplVO = oldEntityImplVO;
}

There are arguably nine different paths shown there. Incidentally, this snippet is part of a larger 350-plus-line method, which was shown to have 41 distinct paths. Imagine you were tasked with modifying this method to add a new feature. If you didn’t write this method, do you think you could make the requisite changes without introducing a defect? Keep on truckin’, man.

Of course, you’d write a test case, but do you think your test case could isolate your particular change in that sea of conditionals?

Cyclomatic complexity, which was pioneered in the Age of Disco (!!), measures exactly this path complexity. By counting the linearly independent paths through a method, this integer-based metric aptly depicts method complexity. In fact, various studies over the years have determined that methods with a cyclomatic complexity (or CC) greater than 10 have a higher risk of defects [ref].
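
As a rough illustration, CC can be approximated as one plus the number of decision points in a method. The sketch below assumes a naive text scan over the method source; CyclomaticEstimate and estimate are hypothetical names, and real tools work from parsed code rather than regular expressions, so treat it as a back-of-the-envelope count only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Rough CC estimate: one base path plus one per decision point found in the text. */
final class CyclomaticEstimate {

    // Branching keywords, short-circuit operators, and the ternary '?'.
    // Assumption: the source has no comments, string literals, or generic
    // wildcards containing these tokens, since a plain text scan cannot tell
    // them apart from real decisions.
    private static final Pattern DECISION = Pattern.compile(
        "\\b(if|while|for|case|catch)\\b|&&|\\|\\||\\?");

    private CyclomaticEstimate() {}

    /** Returns 1 + the number of decision tokens in the given method source. */
    static int estimate(String methodSource) {
        Matcher matcher = DECISION.matcher(methodSource);
        int decisions = 0;
        while (matcher.find()) {
            decisions++;
        }
        return decisions + 1;
    }
}
```

Fed a method body with a single if and nothing else, estimate returns 2: one for the straight-through path and one for the branch.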

Because CC represents the independent paths through a method, it is an excellent number for determining how many test cases are required to cover each of them. For example, the following not-so-copasetic method has a logical defect.

public class PathCoverage {
  public String pathExample(boolean condition) {
    String value = null;
    if (condition) {
      value = " " + condition + " ";
    }
    return value.trim();
  }
}

One test can be written which, interestingly enough, achieves 100% line coverage.

import junit.framework.TestCase;

public class PathCoverageTest extends TestCase {
    public final void testPathExample() {
        PathCoverage clzzUnderTst = new PathCoverage();
        String value = clzzUnderTst.pathExample(true);
        assertEquals("should be true", "true", value);
    }
}

Running a code coverage tool such as Cobertura yields the following report:

[Figure: Cobertura coverage report]

This method has a CC of 2 (one for the default path and one for the if path). Using CC as a more precise gauge of coverage implies a second test case is required. In this case, it would be the path taken by not going into the if condition as shown below.

public final void testPathExampleFalse() {
   PathCoverage clzzUnderTst = new PathCoverage();
   String value = clzzUnderTst.pathExample(false);
   assertEquals("should be false", "false", value);
}

Of course, running this new test case yields that nasty NullPointerException.
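
One way to squash that defect is sketched here, on the assumption that the method should return the trimmed string form of the flag on both paths (which is what the second test asserts). The class name PathCoverageFixed is made up for illustration.

```java
/** One possible fix: every path now returns a non-null, trimmed value. */
final class PathCoverageFixed {

    public String pathExample(boolean condition) {
        // Start from a non-null default so the "false" path no longer
        // calls trim() on null.
        String value = String.valueOf(condition);
        if (condition) {
            value = " " + condition + " ";
        }
        return value.trim();
    }
}
```

The method still has a CC of 2, so it still takes both test cases to exercise it fully. The difference is that the second one now passes.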

Luckily, the method under test in this case only has a CC of two. Imagine if that defect were buried in a method with a CC of 102. Good luck finding it, man. Unfortunately, I routinely run across code with CCs in the 100s too.

Because CC is such a good indicator of code complexity, there is a strong relationship between test-driven development and low CC values. When tests are written often (note, I’m not implying first), developers tend to write uncomplicated code because complicated code is hard to test. If you find you are having difficulty writing a test, it’s a red flag that the code under test may be complex. The short code, test, code, test cycle invites refactoring in these cases, which continually drives the development of uncomplicated code.
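
To make the refactoring point concrete, here is a minimal sketch of the kind of extract-method step that cycle tends to produce. Everything in it (the ShippingCost class, its surcharge rules and numbers) is invented for illustration and is not taken from the code discussed above.

```java
/** Hypothetical example: the class, rules, and amounts are invented. */
final class ShippingCost {

    private ShippingCost() {}

    // Before: three decision points in one method, so a CC of 4.
    static int before(int weightKg, boolean express, boolean member) {
        int cost = 5;
        if (weightKg > 10) {
            cost += 4;
        }
        if (express) {
            if (member) {
                cost += 2;
            } else {
                cost += 6;
            }
        }
        return cost;
    }

    // After: the top-level method has a CC of 1 and delegates each decision.
    static int after(int weightKg, boolean express, boolean member) {
        return 5 + weightSurcharge(weightKg) + expressSurcharge(express, member);
    }

    // CC of 2: a single decision that can be tested on its own.
    static int weightSurcharge(int weightKg) {
        return weightKg > 10 ? 4 : 0;
    }

    // CC of 3: the nested express/member decision, isolated from the weight rule.
    static int expressSurcharge(boolean express, boolean member) {
        if (!express) {
            return 0;
        }
        return member ? 2 : 6;
    }
}
```

The decisions have not gone anywhere, but no single method carries all of them, and each helper can be pinned down with a couple of small tests.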

By determining the CC of class methods in a code base and continually monitoring these values, development teams can keep tabs on code complexity and take appropriate actions to address complexity issues as they arise. Dig it?

For more information on CC and how to refactor high CC code, check out OnJava’s Code Improvement Through Cyclomatic Complexity (written by yours truly).


5 Responses to “Magic Numbers”

  1. on 24 Feb 2006 at 1:06 pm Joe Ponczak

    The interesting issue to me is that Cobertura reported 100% branch coverage based on 1 JUnit test, even though the ‘if’ statement clearly has two branches.

    Also, as the number of decisions increases (more than 3), the value of branch coverage decreases. 100% branch coverage, in many cases, provides a false sense of security. Basis-path coverage, derived from the number of executed cyclomatic paths, is a much better measurement because it looks at the relationships between decisions.

  2. on 24 Feb 2006 at 1:29 pm john miller

    I dig it! Some people use branch coverage over line coverage, but cyclomatic path coverage is even better than that because it tests the different decisions in the method independently of one another, which is what branch coverage misses. Imagine a method composed of two separate IF statements. You can achieve 100% branch coverage by running two tests: one that makes both decisions evaluate TRUE and one that makes them both evaluate FALSE. But you could be skipping over a bug that exists on the TRUE-FALSE path.


  3. [...] Automating code inspections with analysis tools handles 80% of big picture and allows humans to intervene in the 20% that matters. For instance, Java’s PMD will run 180+ rules against a file every time it changes. If a particularly concerning rule is violated, such as a high cyclomatic complexity value, someone can take a look. Can you imagine trying to accomplish this process manually? Why would someone want to? That’s so establishment! [...]


  4. [...] In fact, a number of studies in the early days of computing (before the Golden Age of Disco) did show a correlation between the number of paths through code and defects. One such metric that arose from these studies was Cyclomatic Complexity. This integer based metric precisely measures complexity by counting the distinct paths through a method; moreover, various studies over the years have determined that methods having a cyclomatic complexity (CC or sometimes referred to as CCN) value greater than 10 have a higher risk of defects. [...]


  5. [...] therein lies the most telling metric that a code coverage report can convey — that which isn’t [...]