Chewing the fat over cyclomatic complexity

The hip folks at Enerjy talked with a copasetic crowd recently asking their thoughts on cyclomatic complexity. It seems most (including this disco dancer) find that CCN is an excellent indicator of risk– I haven’t found a better metric yet– in fact, a recent addition to the hip metrics crowd, dubbed C.R.A.P, which was donated by the folks at Agitar, even builds upon CCN by combining code coverage. What do you think, man?

Related odds and ends
 

14 Responses to “Chewing the fat over cyclomatic complexity”

  1. on 13 Feb 2008 at 11:34 pm Andrew Binstock

    CCR is not a great metric IMHO and I think Agitar’s use of it in Crap4J is an error. The big problem with CCR is that it counts each case of a switch statement as a +1 for complexity. So the following code, which is hard to make simpler, gets a rating equal to the total number of case statements–so 110–which obviously has nothing to do with its complexity:

    switch( temperature ) {
    case 108: lowerTemp( 2 ); break;
    case 107: lowerTemp( 2 ); break;
    case 106: lowerTemp( 1 ); break;
    case 105: lowerTemp( 1 ); break:
    ….
    case 0: soundAlarm( 0 ); break;
    default: soundAlarm( i ): break;
    }

    The only way to get around this in CCR is to *add* complexity by breaking this code into multiple switch statements; which makes no sense, but makes CCR and Crap4J completely happy. Since Crap4J has to scan the code anyway, why does it not compute complexity correctly rather than use CCR. This way it can recognize irreducibly simple code for what it is and not create false positives for managers and induce counterproductive practices in developers. Crap4J clearly needs to fix this.


  2. [...] Original post by Andy [...]

  3. on 14 Feb 2008 at 3:05 am Chris

    Seems like the above code *is* problematic and complex. Multiple switch paths, even if the target is “simple” still results in a large number of possible paths through the code. Getting each of those paths correct is non-trivial. Each switch is just another path that someone can screw up. If I had a module like that… hell yes I would want to know about it! I effectively have 108 + 1 + 1 cases where someone or something can foul up.

  4. on 14 Feb 2008 at 5:00 am Andrew Binstock

    @Chris: You’re missing the point. The point of CCR and Crap4J is to identify code that should be fixed because it’s too complex. This code cannot be made simpler: converting to “if statements” doesn’t do it, etc. Even the Command pattern won’t do it. There is nothing to do to lower the compelixity, consequently measures like Crap4J that use magic numbers derived from CCR generate incorrect result. For example, in Crap4J, code is called crap if its CCR is > 31, regardless of how many tests you write for it.

    When a metric falls apart on legitimate code and tells you that you need to simplify it when it cannot be simplified, you have a flawed metric.

    If you as a manager want to know about 30+ cases in switch statements, there are tools that can tell you this (Checkstyle and others). That is not the function of a complexity measuring metric.

  5. on 14 Feb 2008 at 4:06 pm Chris

    In this case a table of temperatures to reduce keyed off the temperature would be a better, more maintainable solution.

    Handed a method with over 30 branch points *is* a problem. Even if you test every line 100% (which CRAP takes into account) the sheer amount of branch points means:

    1) A huge number of test cases to hit 100% line & branch coverage
    2) A high amount of refactoring if the the assumption this code is based on changes

    The metric here would say, high complexity with 100% test coverage… ok so far so good. Then it says… you know what… that amount of complexity *may not be enough to say the code is buggy but chances are good it can and will become buggy later on when maintenance of the code occurs*.

    The above example is almost like an example pulled straight out of “Refactoring to Patterns” [Fowler]. It *can* be made more maintainable.

  6. on 15 Feb 2008 at 8:20 am Andrew Binstock

    If different values in the switch statement call different methods, as in my example, a table is a non-starter. You could use a map and the command pattern. However, the command pattern does not reduce complexity, it simply redistributes the complexity throughout your project–and all the new objects need tests of their own. Whereas a single class with a single switch statement is completely clear and much easier to verify. (Agreed that if the case statements become more than simple do-this, do-that commands, then you have to abandon the switch.)

    Anyhow, it would be more fun to have this dialogue over a beer. (I’ll be at CITcon and JavaOne.) My original point is that CCR’s measure of complexity is flawed in how it counts switch statements. Simple case statements are surely less complex that the equivalent number of if/else statements, so I think they should have a lower level of complexity.


  7. [...] Chewing the fat over switch statements and CCN- Andrew Binstock and Chris have a copasetic discussion going on through comments regarding CCN and switch statements. This is a great read! [...]

  8. on 15 Feb 2008 at 6:18 pm Rich

    Chris wrote:

    “..that amount of complexity *may not be enough to say the code is buggy…”

    Our follow up blog to the video we posted on different peoples views, proves that it is indeed probable that, knowing nothing else about a file, having a higher McCabe value is enough to be fault-prone.

    Of course nothing in this industry is certain…

  9. on 15 Feb 2008 at 7:33 pm Andrew Binstock

    Your analysis in your original post on CCN is more accurate than the statement you make here. In fact, if we look at the chart you generated, (http://www.enerjy.com/blog/wp-content/uploads/2008/02/mccabegraph.jpg) very low CCN counts have a *higher* bug expectancy up to a CCN of 11. CCNs of 1-11 cover a large fraction of Java routines. Moreover, up to a CCN of 25 or so, there is no correlation between CCN and bug frequency. A higher CCN value has equal probability of raising or lowering bug counts in that range.

    And since my experience is that the vast majority of Java methods have CCN of < 25, I think it’s more accurate to say that for the majority of Java code, CCN in fact is not a useful indicator of quality–nor of how many unit tests you should be writing.

  10. on 16 Feb 2008 at 5:28 am Andrew Binstock

    @Rich: The analysis in your original post on CCN is more accurate than the statement you make here. In fact, if we look at the chart you generated, (http://www.enerjy.com/blog/wp-content/uploads/2008/02/mccabegraph.jpg), low CCN counts have a *higher* bug expectancy up to a CCN of 11.

    Moreover, up to a CCN of 25 or so, there is no correlation between CCN and bug frequency. A higher CCN value has equal probability of matching higher or lower bug counts in that range.

    And since my experience is that the *vast* majority of Java methods have CCN of < 25, I think it’s more accurate to say that for the majority of Java code, CCN in fact is not a useful predictor of quality–nor of how many unit tests you should be writing. Above 25, as you state in the original post, CCN does correlate to an increasing bugginess.

  11. on 18 Feb 2008 at 12:00 pm Stephan Schmidt

    The switch statement is problematic, I would probably use a simple rule engine for that case.

    Your view is valid though I think. Different, but equivalent code, has different CCNs.

    case 1: lowerTemperature(3); break;
    case 2: lowerTemperature(2); break;

    has a higher CCN than (Simple doItYourself Java rule engine):

    on(1).adjust(lowerTemperature(3));
    on(2).adjust(lowerTemperature(2));

    although both fragments do the same and perhaps, have the same complexity.

    Peace
    -stephan

  12. on 18 Feb 2008 at 12:05 pm Stephan Schmidt

    ( The on() rule engine code is better though, it can check for errors and it’s easier to debug, it could draw a diagramm or print itself nicely to make debugging easier. And there is no problem with break and people can only call correct commands, not all legal Java code. So perhaps CCN is an indicator that there is a problem with the switch code )

    Peace
    -stephan

  13. on 22 Feb 2008 at 2:39 pm Thomas McCabe Jr.

    Hello All,

    I have been aware of the CRAP metric for a while now but have only taken a cursory look, and therefore have not formulated an opinion on the subject. However, I have seen organizations obtain value from the McCabe cyclomatic complexity metric, not only on its own but also in conjunction with other metrics. For example, metrics such as one which gives the ratio of cyclomatic complexity to SLOC, the essential density (ratio of unstructured paths to cyclomatic), the design complexity (cyclomatic paths related to calls to other modules), and others.

    Cyclomatic complexity measures control flow. The reason why case and switch statements add to cyclomatic complexity is because you do indeed need to test the control flow associated with the case and switch statements.

    However, the one acknowledged exception to limiting the cyclomatic complexity to ten (10) are modules that include a switch or case statement (some can be very large). If you have perfectly structured switch and case statements (and also avoid jumping into loops/decisions and out of loops/decisions within the rest of the logic) the McCabe essential complexity will always be one (1) indicating a perfectly well structured module. McCabe essential complexity is bounded by the cyclomatic complexity and guides in modularizing or partitioning high cyclomatic modules into smaller more understandable pieces.

    Frequently, software engineers obtain cyclomatic complexity values to understand the complexity level, but do not then use it as part of the total Structured Testing Methodology as described in the NIST document (http://www.mccabe.com/iq_research_nist.htm). Being cognizant of cyclomatic paths enables one to catch errors that deal with the interactions of code constructs, which results in higher reliability. Some of the bloggers are hitting on this thought - that the cyclomatic complexity (even as it is incremented for all outcomes of a switch outcome) can at least help you be aware of all paths, so you can make more informed decisions about your testing effort. More info on this subject can be at

    http://www.mccabe.com/iq_research_appNotes_1.htm.

  14. on 04 Jun 2008 at 4:48 pm Software Company

    hey

    Really nice tips!

    will try to keep it in my mind

    thanks!

Trackback this Post | Feed on comments to this Post

Leave a Reply