March 2006
Monthly Archive
Code metrics can be useful if they are applied correctly. Productivity, for example, is an area where code metrics have been abused. These valuable measurements, however, can be effective in objectively spotting complexity. Since complexity usually correlates to defects, doesn’t it make sense then to measure and track complexity within a code base?
The third article in IBM developerWorks’ hip series “In Pursuit of Code Quality” explores Cyclomatic Complexity and how one can obtain it via JavaNCSS. Check out “Monitoring cyclomatic complexity” and if you have any thoughts, questions, or concerns, check out the “Improve Your Java Code Quality” forum!
2 comments Friday 31 Mar 2006 | Andy | Articles
Just because a framework claims to be a JUnit extension doesn’t necessarily mean it can’t be used within TestNG. So long as the target framework:
it can be easily incorporated into TestNG.
For example, XMLUnit can easily be used with TestNG- all that’s required is a configuration annotation on a setUp style method which will configure various copasetic parsers for XMLUnit.
/**
 * @testng.configuration beforeTestClass = "true"
 */
protected void configure() throws Exception {
    XMLUnit.setControlParser(
        "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
    XMLUnit.setTestParser(
        "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl");
    XMLUnit.setSAXParserFactory(
        "org.apache.xerces.jaxp.SAXParserFactoryImpl");
    XMLUnit.setIgnoreWhitespace(true);
}
Notice how I’ve signified this fixture to be run once, rather than the establishment one-for-every-test-case JUnit standard.
Now I can test an application’s generated XML with abandon!
/**
 * @testng.test
 */
public void assertToXML() throws Exception {
    BatchDependencyXMLReport report =
        new BatchDependencyXMLReport(new Date(9000000),
            this.getFilters());
    report.addTargetAndDependencies("com.vanward.test.MyTest",
        this.getDependencies());
    report.addTargetAndDependencies("com.xom.xml.Test",
        this.getDependencies());
    Diff diff = new Diff(new FileReader(
        new File("./test/conf/report-control.xml")),
        new StringReader(report.toXML()));
    Assert.assertTrue(diff.identical(), "XML was not identical");
}
XMLUnit isn’t the only JUnit extension that one can use within TestNG- DbUnit, JWebUnit, and JUnit-addons are just a few of the many available ones to choose from!
If the fear of not being able to use a JUnit extension is holding you back from giving TestNG a try, look closely at how the extension was designed for extensibility. Dig it?
1 comment Monday 27 Mar 2006 | Andy | Developer Testing, TestNG
Ever noticed that objects that have a lot of dependencies on other objects become somewhat brittle? If one of their dependencies changes, does the object itself break, man? Ever noticed the opposite? Have you ever observed that objects that every other object in a system depends on have the tendency to create issues elsewhere when they’ve changed? (Interestingly, this tendency to create issues elsewhere is commonly referred to as the “Collateral Damage” effect.)
There are a series of hip coupling metrics that are known as Afferent Coupling and Efferent Coupling (or sometimes referred to as Fan In and Fan Out, respectively). These integer based metrics count the objects coupled to a given object: Afferent Coupling counts the objects that depend on it, while Efferent Coupling counts the objects it depends on. Both Afferent Coupling and Efferent Coupling signify an architectural maintenance issue: either an object has too much responsibility (high Afferent Coupling) or the object isn’t independent enough (high Efferent Coupling).
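To make the two counts concrete, here is a minimal Python sketch that derives both from a map of who depends on whom. It is only an illustration of the counting idea, not how any particular tool computes its numbers, and the module names in `DEPENDS_ON` are made up.

```python
# Hypothetical dependency map: each key depends on (imports) the names in its set.
DEPENDS_ON = {
    "billing": {"core", "util"},
    "reports": {"core", "billing"},
    "core": set(),
    "util": {"core"},
}


def efferent_coupling(name, depends_on):
    """Count the distinct things that `name` itself depends on (outgoing)."""
    return len(depends_on.get(name, set()))


def afferent_coupling(name, depends_on):
    """Count the distinct things that depend on `name` (incoming)."""
    return sum(
        1 for other, deps in depends_on.items()
        if other != name and name in deps
    )
```

In this made-up map, `core` is depended on by everything and depends on nothing, while `reports` depends on two modules and nothing depends on it.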
Interestingly, these types of dependency metrics can be extremely helpful in determining the risk in maintaining a code base. Objects or namespaces/packages with too much responsibility present dangerous situations when those objects need to be changed. If, somehow, their behavior changes, other objects within a software system may stop functioning as intended. Objects that are highly dependent on other objects present brittleness in the face of change- they too may stop functioning as intended if one of their imported objects changes, even in subtle ways.
What’s more, both Afferent and Efferent Coupling can be combined to form an “instability” ratio. For example, the following smokin’ equation can represent an object’s (or namespace/package’s) level of instability in the face of change: a value of one is completely unstable, while a value of zero is stable.
Instability = Ce / (Ca + Ce), where Ce is the Efferent Coupling and Ca is the Afferent Coupling
This Instability metric can apply to objects or packages/namespaces as well.
NDepend for the .NET platform is a trippin’ open source project, which reports Efferent Coupling, Afferent Coupling, Instability and a number of other interesting architectural metrics. These metrics are reported by Assembly and by class. The tool is easily executed via NAnt and produces reports in both XML and HTML formats.
For example, the following HTML report displays metrics for a .NET assembly, which in this case is the NUnit framework.
Note how the nunit.framework assembly has an Afferent Coupling of 140 and an Efferent Coupling of 0. This means that this is the core code of the NUnit framework (if you couldn’t figure that out by its name); however, this also means this code can’t change easily. Hence, the Instability value for this assembly is zero- because so many other objects depend on this core code there is a natural tendency for this code not to change without something breaking quickly.
On the other hand, note the nunit.tests assembly. NDepend reported an Efferent Coupling value of 76 and an Afferent Coupling value of 0; therefore, the Instability is 1 or unstable- this makes sense- anytime code changes, tests usually break (if they don’t, by the way, there could be issues with those tests).
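The two NUnit readings above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the usual definition of Instability as Efferent Coupling divided by the sum of Afferent and Efferent Coupling (which matches the zero and one values reported); the function name and the choice to return 0.0 for an object with no couplings at all are assumptions of the sketch.

```python
def instability(afferent, efferent):
    """Instability = Ce / (Ca + Ce); 0 is stable, 1 is unstable."""
    total = afferent + efferent
    if total == 0:
        # No couplings either way: the ratio is undefined, so this
        # sketch simply reports 0.0 (an assumption, not a standard).
        return 0.0
    return efferent / total


print(instability(afferent=140, efferent=0))  # nunit.framework -> 0.0
print(instability(afferent=0, efferent=76))   # nunit.tests -> 1.0
```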
Understanding these metrics for one’s code base can have dramatic effects on maintainability. For instance, assemblies with high Afferent Coupling should have a high degree of associated tests to ensure proper functionality. If a lot of code is dependent on that assembly, wouldn’t you want to guarantee it is reliable? Also, evaluating the long term implications of Afferent Coupling could drive teams to decide to break assemblies into smaller more malleable chunks of code.
On the other hand, assemblies with a high Efferent Coupling are subject to breakage easily. Again, having a copasetic amount of code coverage regarding these assemblies will help teams spot troubles quickly.
In a Continuous Integration environment, monitoring these values over time can enable development teams to make corresponding changes rapidly before things get out of control. If during a series of builds the Afferent Coupling of an assembly grows rapidly, the following measures can be taken:
In addition to NDepend for .NET, JDepend is a similar open source project for the Java platform, which reports coupling metrics by package. JDepend can be run via Ant or Maven and also produces reports in XML and HTML formats.
Architectural coupling metrics can effectively spot long term maintenance issues for a code base by reporting assembly/package or object couplings. These metrics can provide insights into any associated risks in the face of change. What’s more, monitoring these metrics on a regular basis in a Continuous Integration environment effectively brings these risks to light before they become maintenance nightmares. Dig it?
3 comments Tuesday 14 Mar 2006 | Andy | Code Metrics, Continuous Integration, Developer Testing
Have you ever noticed that long methods are sometimes hard to follow? Ever had trouble understanding the logic in an excessively deeply nested conditional? Your copasetic instincts for eschewing this type of code are correct. Long methods and methods with a high number of paths are hard to understand and, interestingly enough, tend to correlate to defects.
In fact, a number of studies in the early days of computing (before the Golden Age of Disco) did show a correlation between the number of paths through code and defects. One such metric that arose from these studies was Cyclomatic Complexity. This integer based metric precisely measures complexity by counting the distinct paths through a method; moreover, various studies over the years have determined that methods having a cyclomatic complexity (CC or sometimes referred to as CCN) value greater than 10 have a higher risk of defects.
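To get a feel for what is being counted, here is a rough Python sketch that approximates the metric as "decision points plus one" by walking a function's syntax tree with the standard `ast` module. It works on Python source purely for illustration; which node types count as a decision point is an assumption of this sketch, and real tools differ in the details.

```python
import ast

# Node types treated as opening an extra path (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Return {function name: decision points + 1} for the given source."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            count = 1  # a function with no branches has exactly one path
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    count += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds one path per extra operand
                    count += len(child.values) - 1
            results[node.name] = count
    return results


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in (2, 3):
        if n % d == 0 and n > d:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SAMPLE))  # {'classify': 5}
```

The sample function has two `if` statements, one `for` loop, and one `and`, so the sketch reports four decision points plus one.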
JavaNCSS is an excellent tool which determines the lengths of methods and classes by examining Java source files. What’s more, this tool also gathers the cyclomatic complexity of every method in a code base. By configuring JavaNCSS either through its Ant task or via a Maven plug-in, an XML report is generated which lists:
The tool ships with a few hip style sheets which can then be used to generate an HTML report summarizing the data. For example, the HTML report generated by Maven has a section labeled “Top 30 functions containing the most NCSS”, which details the largest methods in the code base, which incidentally almost always correlate to methods containing the highest cyclomatic complexity. For instance, this report lists the class DBInsertQueue’s updatePCensus method as having a length of 283 and a cyclomatic complexity (labeled as CCN) of 114.
The question now becomes “so what?”
Because high cyclomatic complexity values tend to correlate to defects, the first course of action is to verify the existence of any corresponding tests. If there are any tests, how many are there? A neat-o rule of thumb when dealing with cyclomatic complexity is that, in order to obtain full coverage of a method, one would need a number of test cases equal to the cyclomatic complexity value (i.e. in the example above, to achieve full coverage of the updatePCensus method, 114 test cases would be required). Only the rarest code base would actually have 114 test cases for this method; however, having a few is a great start in reducing the risk of this method containing defects.
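As a small illustration of that rule of thumb, the Python sketch below pairs a function that has a cyclomatic complexity of 3 (two `if` statements plus one) with three test cases, one per path. The `shipping_cost` function and its rates are hypothetical, invented only for this example.

```python
import unittest


def shipping_cost(weight_kg, express):
    """Hypothetical function with a cyclomatic complexity of 3."""
    if weight_kg <= 0:            # path 1: invalid input
        raise ValueError("weight must be positive")
    if express:                   # path 2: express rate
        return 10 + 2 * weight_kg
    return 5 + weight_kg          # path 3: standard rate


class ShippingCostTest(unittest.TestCase):
    """One test case per path through shipping_cost."""

    def test_rejects_non_positive_weight(self):
        with self.assertRaises(ValueError):
            shipping_cost(0, express=False)

    def test_express_rate(self):
        self.assertEqual(14, shipping_cost(2, express=True))

    def test_standard_rate(self):
        self.assertEqual(7, shipping_cost(2, express=False))
```

Saved in a module, these cases can be run with `python -m unittest`.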
If there aren’t any associated test cases, there is an obvious need to test this method (if it turns out there is a desire to truly lower the risk of defects). Some may immediately think it’s time to refactor; however, that would break the first rule of refactoring: write a test case. Once test cases are in place, lower risk refactoring can occur. In the case of cyclomatic complexity, the most effective way to reduce it is to apply the extract method technique and push the complexity into smaller, more manageable and therefore more testable methods. Of course, those smaller methods should then be tested.
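Here is a short Python sketch of what extract method looks like in practice. All of the names and pricing rules are hypothetical; the point is only that the branchy "before" function and the "after" version built from small helpers return the same answers, while each helper is simple enough to test on its own.

```python
def order_total_before(items, is_member, coupon):
    """Before: every decision lives in one method."""
    total = 0
    for price, qty in items:
        if qty > 0:
            total += price * qty
    if is_member:
        if total > 100:
            total *= 0.9
        else:
            total *= 0.95
    if coupon == "FREESHIP":
        shipping = 0
    else:
        shipping = 5
    return total + shipping


# After: each decision is pushed into its own small, testable function.
def subtotal(items):
    return sum(price * qty for price, qty in items if qty > 0)


def member_discount(total, is_member):
    if not is_member:
        return total
    return total * (0.9 if total > 100 else 0.95)


def shipping_fee(coupon):
    return 0 if coupon == "FREESHIP" else 5


def order_total_after(items, is_member, coupon):
    return member_discount(subtotal(items), is_member) + shipping_fee(coupon)
```

With tests already in place for the original method, the same tests can confirm the refactored version before new ones are added for the helpers.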
In a Continuous Integration environment, evaluating this method’s complexity over time becomes possible. For instance, having run the report for the first time, this method’s complexity value can be monitored for any associated growth. If this is the case, appropriate action can be taken.
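A minimal sketch of that kind of monitoring, in Python: compare the CCN values from two builds and list the methods worth a look. The snapshot format (a plain dictionary of method name to CCN) and the function name are assumptions; in practice the numbers would be pulled from whatever report the build produces. The threshold of 10 is the commonly cited limit mentioned earlier.

```python
THRESHOLD = 10  # commonly cited CCN limit


def complexity_alerts(previous, current, threshold=THRESHOLD):
    """Compare two {method: CCN} snapshots and list methods worth a look."""
    alerts = []
    for method, ccn in sorted(current.items()):
        before = previous.get(method)
        if before is not None and ccn > before:
            alerts.append((method, before, ccn, "grew"))
        elif before is None and ccn > threshold:
            alerts.append((method, None, ccn, "new and over threshold"))
    return alerts


# Hypothetical snapshots: only the value 114 comes from the report
# discussed above; every other number here is made up.
last_build = {"DBInsertQueue.updatePCensus": 110, "A.m": 4}
this_build = {"DBInsertQueue.updatePCensus": 114, "A.m": 4, "B.n": 12}

for alert in complexity_alerts(last_build, this_build):
    print(alert)
```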
If a method’s CCN value keeps growing, teams can do a number of things:
Because JavaNCSS also reports on documentation trends, these values can be monitored for organizational documentation standards. Interestingly enough, the tool reports single line comments and multi-line comments in addition to JavaDocs. In some software circles, the mere presence of a high amount of inline code comments can be considered an indication of complexity.
JavaNCSS isn’t the only trippin’ tool available to the Java platform which can facilitate complexity reporting. PMD, another open source project, which analyzes Java source files, has a series of rules which report on complexity, including cyclomatic complexity, long classes and long methods. CheckStyle is another open source project with similar rules as well. Both PMD and CheckStyle also have Ant tasks and Maven plug-ins.
Complexity has been shown to correlate to defects; therefore, it makes sense to monitor a code base’s complexity values and take appropriate actions to lower defect risks via test cases and refactoring. Dig it?
comments off Saturday 11 Mar 2006 | Andy | Code Metrics, Continuous Integration, Developer Testing
Peer based code reviews are generally considered copasetic to the overall quality of a code base because they present opportunities for an objective analysis via a second pair of eyes. For this same reason, XP’s pair programming ritual offers some of the same objective analysis benefits. Even static source code analysis tools like Java’s PMD or .NET’s FxCop, which scan files for violations of predefined rules, offer some of the same analysis benefits.
All three of these techniques for code analysis (code reviews, pair programming, and static code analysis), however, are marginally useful if they are capriciously applied- the simple reason being that their analysis benefits fade over time without proactive reinforcement. Moreover, the first two techniques, code reviews and pair programming, are performed by humans, who are error prone and have a limited capacity when compared to computers (and disco moves).
Code reviews, when conducted efficiently, can be impressively effective; however, they are run by mortals, who tend to be emotional (meaning that it can be difficult for a colleague to tell another their code stinks). More importantly, people collaborating in a work environment have the tendency to subjectively review one another’s work. There is also a time cost associated with code reviews, even in the most informal of environments.
Pair programming has also been shown to be effective when applied correctly. Having another pair of eyes constantly reviewing code can yield higher quality code; however, organizations practicing this innovative technique are in a minority. Pairs also suffer the same issues of emotion and subjectivity (disco dancing is another story, however).
The difference with a static analysis tool is twofold, however. First, these tools are incredibly cheap to run often. They only require human intervention to configure and run once- after that they are automated. Thus, these tools offer a time saving aspect. Second, these tools harness the unflinching and un-forgetful objectiveness of a computer. A computer won’t offer those “your code looks fine if you say mine looks fine” code review compromises nor will your computer complain about bio-breaks and personal time if you decide to run an inspection tool every time the CM repository changes.
These tools are also customizable- organizations can choose the most relevant rules for their code base and run these rules every time code is checked into the source repository. These tools become, in essence, tireless watchers of source code, which is practically impossible to mimic with human intervention.
These tools also work incredibly well in geographically distributed environments (i.e. some developers working from home, others at the office, and others in another state, country, continent, etc). Having a human manage the review of all code in this scenario is a costly proposition!
Automated static code analysis scales more efficiently than humans for large code bases; furthermore, some tools offer hundreds of different hip rules, which a human can’t possibly remember while reviewing a series of files. Moreover, running a tool’s myriad rules against your code base will take less time than having your partner review one package.
Automating code inspections with analysis tools handles 80% of the big picture and allows humans to intervene in the 20% that matters. For instance, Java’s PMD will run 180+ rules against a file every time it changes. If a particularly concerning rule is violated, such as a high cyclomatic complexity value, someone can take a look. Can you imagine trying to accomplish this process manually? Why would someone want to? That’s so establishment!
comments off Thursday 09 Mar 2006 | Andy | Code Metrics, Continuous Integration
Because most code coverage tools instrument a hip code base with additional behavior for reporting purposes (i.e. the code is instrumented with “listeners”, which report when they’ve been executed), tests run slower than they do in non-coverage scenarios. This can have negative effects in a continuous integration environment if the coverage process isn’t well thought out. Running the coverage process every time code is checked into a repository is most likely overkill. In fact, running the coverage process more than once a day is most likely overkill.
If there are tripped out strategies for running tests at different intervals (which map to test categorization), it makes sense then to create an additional strategy where the coverage process is run once a day as part of each categorical test run. For example, every time the repository changes, the unit test process is run. At regular intervals throughout the day, component tests are executed and most likely, once a day (usually during the evening), system tests are run. After the system test process is run, another series of tests can be run where coverage is turned on (i.e. unit tests run against an instrumented code base, component tests run against an instrumented code base, and then system tests run against an instrumented code base). This process will create a series of reports which can then be viewed by the team the following morning.
Because three different reports are created in this process, each one must be viewed with an eye towards the fact that uncovered code in one report may show high coverage in a different report. For example, class Foo may have 0% coverage in the unit test report; however, it may show high coverage rates in the system test report.
Also note, because three coverage reports are going to be run, the build process must be configured so as to not overwrite the previous smokin’ coverage report (i.e. if the build isn’t properly configured to move or write the report to a unique location, the component test coverage report may overwrite the unit test report). Some tools, like Java’s Cobertura, have a merge capability which can facilitate the creation of one master report too.
For example, an Ant target is defined below which merges the Cobertura coverage reports from three different test runs.
<target name="merge-coverage" depends="all-coverage-run">
  <cobertura-merge datafile="${cobertura.all.ser}">
    <fileset dir="${base.dir}">
      <include name="${cobertura.comp.ser}" />
      <include name="${cobertura.unit.ser}" />
      <include name="${cobertura.sys.ser}" />
    </fileset>
  </cobertura-merge>
  <mkdir dir="${cov.report.dir}"/>
  <cobertura-report format="html"
      datafile="${base.dir}/${cobertura.all.ser}"
      destdir="${cov.report.dir}" srcdir="${src.dir}" />
</target>
Because the coverage process affects the performance of the code under test, care (and a little bit of peace, love, and grooviness) must be taken if performance tests are mixed in. For this reason alone, it is highly recommended not to run performance, stress, or load tests during the coverage process.
For example, below is the JUnit task’s batchtest element for running a series of component tests with coverage turned on. Note how a few tests (corresponding to load, stress, and performance categories) are excluded from the run.
<batchtest todir="${testreportdir}">
  <fileset dir="test/component">
    <include name="**/*Test.*" />
    <exclude name="**/*StressTest.java" />
    <exclude name="**/BatchDepXMLReportPerfTest.java" />
    <exclude name="**/BatchDepXMLReportLoadTest.java"/>
  </fileset>
</batchtest>
Like test categorization and test frequency strategies in a continuous integration environment, copasetic coverage report frequencies and configurations must be properly thought out so as to obtain the maximum benefit with as few headaches as possible. Just remember to not be fooled by the coverage report. Dig it?
comments off Sunday 05 Mar 2006 | Andy | Continuous Integration, Developer Testing
Categorizing developer tests into three respective disco buckets (unit tests, component tests, and system tests) has efficiency benefits when it comes to Continuous Integration. For example, running system tests every time the repository changes is a time and resource consuming task that may not be worth the effort. Why not run unit tests every time someone checks code in as they are cheap to run, then schedule periodic intervals to run component tests and then another interval for system tests? Those runs can be made more frequent as iterations come to a close and less frequent at the initial stages too.
Because it’s their bag, frameworks like NUnit for .NET and JUnit 4.0 and TestNG for Java have annotations, which make categorizing tests quite easy to implement; however, in other frameworks, segregating tests is a bit more challenging.
For example, with pre 4.0 JUnit versions, there is no copasetic mechanism within the framework itself or within Ant to easily divide tests up into three groups. This can be achieved, however, with a simple naming scheme or, even easier, with a hip directory strategy.
One best practice for developer testing is to place unit tests in a separate directory from the source code. For example, a project directory structure would have a src folder for the source code and a test folder for associated tests.
A sample project would have a root directory like this:
$ ls -lt ./
total 62
drwx------+ 6 Andy.Glover usrs 0 Feb 19 21:37 test
drwx------+ 4 Andy.Glover usrs 0 Dec 27 21:20 src
-rwx------+ 1 Andy.Glover usrs 32452 Nov 15 17:51 build.xml
-rwx------+ 1 Andy.Glover usrs 260 Aug 30 2005 build.properties
The src directory would obviously contain directories which map to source code packages, man.
With the test directory, categorization is possible by creating three additional internal directories: unit, component, and system. For example, the directory listing would look like this:
$ ls -ltr ./test
total 0
drwx------+ 4 Andy.Glover usrs 0 Mar 1 22:09 unit
drwx------+ 2 Andy.Glover usrs 0 Mar 1 22:09 conf
drwx------+ 4 Andy.Glover usrs 0 Mar 1 22:11 component
drwx------+ 3 Andy.Glover usrs 0 Mar 1 22:12 system
There is a unit, component, and system directory listed above, which would then contain the associated tests for each category. Note, the conf directory would hold associated properties files, etc required for testing.
The unit directory, for example, would have a directory structure which maps to the unit tests’ package names (which usually map to the packages of the corresponding classes under test) like this:
$ ls -ltr ./test/unit/test/com/van/sedna/frmwrk/filter/
total 12
-rwx------+ 1 Andy.Glover usrs 1190 Oct 25 2004 SimFilterTest.java
-rwx------+ 1 Andy.Glover usrs 2708 Oct 25 2004 RegexFilterTest.java
-rwx------+ 1 Andy.Glover usrs 1678 Nov 20 17:30 ClassFilterTest.java
Now that the tests are segregated into separate directories, your chosen build system needs a trippin’ update. In the case of Ant, three targets are created. One for running unit tests, another for running component tests and another for running those system tests. A fourth target could also be created which calls all three of the previous targets so as to run the entire test suite.
For example, for system tests, the all too familiar batchtest element of the JUnit task would look something like this:
<batchtest todir="${testreportdir}">
  <fileset dir="test/system">
    <include name="**/*Test.*" />
  </fileset>
</batchtest>
Note how there isn’t any special naming scheme going on here- tests are still appended with a Test regardless of granularity.
If the target for running the system tests was named test-system, then running it is as simple as typing ant test-system (don’t forget to set up the associated depends clause for compiling, deploying, etc).
Implementing a categorization strategy for developer tests is fairly easy, so long as the team commits to a common mechanism. Running these categories at various intervals within a CI strategy then becomes a simple matter of calling the proper build target. Dig it?
2 comments Saturday 04 Mar 2006 | Andy | Continuous Integration, Developer Testing, JUnit
The notion of an abstract class or interface doesn’t really exist in Python. To create a base class which isn’t intended to be utilized directly (for example in the template pattern) and which contains abstract methods anticipated to be implemented by subclasses, you can use the not implemented technique.
To do this, first define the intended hip algorithm in the base class, which usually consists of concrete methods relying on subclass implementations of the abstract methods. Next, define the methods in the base class which must be defined by subclasses and add a raise NotImplementedError clause to each one.
For example, in the copasetic abstract base class DefaultDatabaseTest, the setUpOp method makes the following call:
dbseed.RefreshOperation().execute(self._dataSet())
The self._dataSet() call is an internal method (the _ preceding the method name is an indication that this method is intended to be private). Looking at that method, we see this definition:
def _dataSet(self):
    return dbseed.FlatXMLFileRecordProducer(self._context(),
                                            self.dataSetFile())
Note how this method calls two instance methods- _context (which again is private) and dataSetFile. Examining the dataSetFile method below shows that this method doesn’t do anything interesting (or helpful) in the base class- it merely raises a trippin’ exception!
def dataSetFile(self):
    """
    implement this method- provide a location to the file
    """
    raise NotImplementedError
Therein is the definition of a template pattern algorithm- subclasses must implement the dataSetFile method so that the base class’s algorithm can work properly.
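Stripped of the database details, the technique fits in a few lines. The sketch below uses hypothetical class names and raises `NotImplementedError`, Python's built-in exception for this purpose (the bare name `NotImplemented` is a constant meant for comparison methods, and raising it fails with a TypeError instead).

```python
class Greeter:
    """Base class: greet() is the template; name() is the abstract step."""

    def greet(self):
        # The algorithm lives here and leans on the subclass for name().
        return "Hello, %s!" % self.name()

    def name(self):
        """Subclasses must override this method."""
        raise NotImplementedError("subclasses must implement name()")


class WorldGreeter(Greeter):
    """Concrete subclass: supplies the one missing step."""

    def name(self):
        return "world"


print(WorldGreeter().greet())  # Hello, world!
```

Calling `Greeter().greet()` directly raises `NotImplementedError`, which is the signal that the base class was never meant to be used on its own.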
The template pattern is quite handy when it comes to PDbSeed. One can create a PyUnit test class which can serve as a base class for database testing. Here is the full definition of the DefaultDatabaseTest class:
import unittest
import pdbseed.core.dbseed as dbseed


class DefaultDatabaseTest(unittest.TestCase):
    """
    This class is modeled after DbUnit's DatabaseTestCase. This
    class is abstract in nature due to the raise NotImplementedError
    clauses in methods intended to be overridden in subclasses.
    Subclasses MUST override: connection(), metaFile(), and
    dataSetFile().
    """

    def setUp(self):
        self.setUpOp()

    def tearDown(self):
        self.tearDownOp()

    def setUpOp(self):
        dbseed.RefreshOperation().execute(self._dataSet())

    def tearDownOp(self):
        """
        the algorithm for seeding does a refresh (which is an
        insert OR update); therefore, the teardown is empty
        """
        pass

    def connection(self):
        """
        this method should return an instance of
        pdbseed.extension.databasetestcase.Connection
        """
        raise NotImplementedError

    def metaFile(self):
        """
        implement this method- provide a location to the file
        """
        raise NotImplementedError

    def dataSetFile(self):
        """
        implement this method- provide a location to the file
        """
        raise NotImplementedError

    def _context(self):
        """
        builds the seeding context from the subclass's connection
        details and metadata file
        """
        conn = self.connection()
        return dbseed.ContextFactory.createContext(conn.db,
                                                   self.metaFile(),
                                                   conn.dbhost,
                                                   conn.dbname,
                                                   conn.dbuser,
                                                   conn.dbpassword)

    def _dataSet(self):
        return dbseed.FlatXMLFileRecordProducer(self._context(),
                                                self.dataSetFile())


class Connection:
    """
    This class is a simple struct that represents the
    configuration information for a database
    """

    def __init__(self, db, dbhost, dbname, dbuser, dbpassword):
        self.db = db
        self.dbhost = dbhost
        self.dbname = dbname
        self.dbuser = dbuser
        self.dbpassword = dbpassword
Utilizing this class becomes a simple exercise. First, implement the three abstract methods (connection, dataSetFile, and metaFile) and then second, write test cases which rely on data in the database to test your application layer.
For example, below is a test case which shows DefaultDatabaseTest in use.
import unittest
import xpdbseed.extension.databasetestcase
from xpdbseed.extension.databasetestcase import DefaultDatabaseTest
from xpdbseed.extension.databasetestcase import Connection


class MockDbTestCase(DefaultDatabaseTest):

    def metaFile(self):
        return 'C:/dev/projects/pro/dbtesting/conf/metadata.xml'

    def dataSetFile(self):
        return 'C:/dev/projects/pro/dbtesting/conf/words-seed.xml'

    def connection(self):
        return Connection(db="mysql", dbhost="localhost",
                          dbname="words", dbuser="words",
                          dbpassword="words")

    def testDataIsThere(self):
        """
        test case creates its own connection to the
        db to verify that data is already there, in
        this case 'pugnacious' is found as it was
        seeded via the words-seed.xml file
        """
        import MySQLdb
        conn = MySQLdb.connect(host="localhost",
                               user="words", passwd="words", db="words")
        query = ("select word.part_of_speech from word "
                 "where word.spelling = 'pugnacious';")
        curs = conn.cursor()
        curs.execute(query)
        wlist = curs.fetchall()
        for word in wlist:
            self.assertEqual('Adjective', word[0], 'Adj was not returned')
        conn.close()


if __name__ == "__main__":
    unittest.main()
Hopefully, this template class will make it into the 0.8 version of PDbSeed. Doesn’t it make you want to whip out a couple of test cases? Do you dig it, man?
4 comments Thursday 02 Mar 2006 | Andy | Languages, PyUnit, Python
It’s arguable in this Age of Aquarius that a large portion of software defects are the result of a misunderstanding between what a business expects and what a business eventually gets. Writing high-quality software requirements is an incredibly challenging proposition; however, even well written, hip requirements can be implemented incorrectly (even with a boat load of trippin’ developer tests!).
FIT is a copasetic attempt at bridging this misunderstanding gap with its innovative technique of bringing the writers of requirements into the testing process early through the use of tables. The second article in the series “In Pursuit of Code Quality” explores this fitting framework and even shows how it can fit into a development environment which utilizes JUnit.
Make 2006 a banner year- Resolve to get FIT!