February 2006

The Fuzz on Developer Testing

I often find that people use the term “unit test” rather broadly. This can cause confusion, especially when people start claiming their unit tests “take too long to run”. Defining a common vocabulary for developer tests can assist in categorizing them into neat-o groups, which can make all the difference in creating an effective continuous integration process.

Unit Tests- Unit tests verify the behavior of small elements in a software system, which are most often hip single classes. Occasionally, though, the one-to-one relationship between a unit test and a class stretches to include additional classes because the classes under test are tightly coupled. Because of this small wrinkle, it can be helpful to further segregate unit tests into two types: isolated unit tests and semi-isolated unit tests.

The key aspect, however, is that unit tests (regardless of isolation) do not rely on outside dependencies such as databases, which have the tendency to increase the amount of time it takes to set up and run tests.
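
To make that concrete, here is a minimal sketch of keeping a database out of a unit test by handing the class under test an in-memory stand-in. The names WordLookup and InMemoryWordStore are hypothetical and exist only for illustration; they aren't part of any code discussed in this post.

```ruby
# Hypothetical names throughout: WordLookup and InMemoryWordStore are
# illustrations only, not code from any real project.

# The class under test. It depends on a "store" object that answers
# find_definitions(word) with an array of strings.
class WordLookup
  def initialize(store)
    @store = store
  end

  # Returns the first definition for the word, or nil if there is none.
  def first_definition(word)
    definitions = @store.find_definitions(word)
    definitions.empty? ? nil : definitions.first
  end
end

# A stand-in for a database-backed store; it keeps everything in a Hash.
class InMemoryWordStore
  def initialize(entries)
    @entries = entries
  end

  def find_definitions(word)
    @entries.fetch(word, [])
  end
end

store = InMemoryWordStore.new("pugnacious" => ["Combative in nature; belligerent."])
lookup = WordLookup.new(store)
puts lookup.first_definition("pugnacious")
```

Because the store is just a Hash behind a small class, there is nothing to install or seed before the test runs, which is what keeps this kind of test fast.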

Unit tests can be created and run early in the development cycle (like day one); furthermore, because so little time passes between writing code and seeing test results, unit tests are an extremely efficient way of debugging.

Because it’s my bag, below is an example unit test written in Ruby, which verifies the behavior of a filtering type.

require "test/unit"
require "filters"

class FiltersTest < Test::Unit::TestCase

  def test_regex
    fltr = RegexFilter.new(/Google|Amazon/)
    assert(fltr.apply_filter("Google"))
  end

  def test_simple
    fltr = SimpleFilter.new("oo")
    assert(fltr.apply_filter("google"))
  end

  def test_filters
    fltrs = [SimpleFilter.new("oo"), RegexFilter.new(/Go+gle/)]
    fltrs.each do |fltr|
      assert(fltr.apply_filter("I love to Goooogle"))
    end
  end
end

This test is extremely simple and runs in a flash. There is little to no set up and no outside dependencies either.
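
The filters file that the test requires isn't shown here. As a rough sketch, assuming SimpleFilter does a plain substring check and RegexFilter does a regular expression match, it could look something like the following. Everything inside these classes is an assumption; only the class names and the apply_filter method come from the test itself.

```ruby
# filters.rb: a hypothetical implementation, guessed from how the test
# uses these classes. The real file is not shown in the post.

# Passes when the candidate string contains the given substring.
class SimpleFilter
  def initialize(substring)
    @substring = substring
  end

  def apply_filter(candidate)
    candidate.include?(@substring)
  end
end

# Passes when the candidate string matches the given regular expression.
class RegexFilter
  def initialize(pattern)
    @pattern = pattern
  end

  def apply_filter(candidate)
    !(@pattern =~ candidate).nil?
  end
end
```

Both classes answer to the same apply_filter method, which is why the last test case can loop over a mixed array of filters.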

Component Tests- Component or subsystem tests exercise portions of a system and may require a fully installed system or a more limited set of external dependencies, such as databases, file systems, or network endpoints, to name a few, man. These tests verify that components interact to produce the expected aggregate behavior.

A typical component test requires the underlying database to be running and may even cross architectural boundaries.

Because larger amounts of code are exercised by each tripping test case, more code coverage is obtained per test; however, these tests have the tendency to take longer to run than unit tests.

Below is an example of a component test which utilizes DbUnit to seed a database and then attempts to find data based upon the contents of the DB.

package test.org.aglover.words.dao;

import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Iterator;

import junit.framework.TestCase;

import org.aglover.words.bizobj.IDefinition;
import org.aglover.words.bizobj.IWord;
import org.aglover.words.dao.impl.WordDAOImpl;
import org.dbunit.DatabaseTestCase;
import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;

public class DefaultWordDAOImplTest extends DatabaseTestCase {

    protected IDataSet getDataSet() throws Exception {
        return new FlatXmlDataSet(new File("test/conf/wseed.xml"));
    }

    protected IDatabaseConnection getConnection() throws Exception {
        // Load the MySQL JDBC driver so DriverManager can find it.
        Class.forName("org.gjt.mm.mysql.Driver");
        final Connection jdbcConnection =
            DriverManager.getConnection(
                "jdbc:mysql://localhost/words",
                "words", "words");
        return new DatabaseConnection(jdbcConnection);
    }

    public void testFindVerifyDefinition() throws Exception{
        final WordDAOImpl dao = new WordDAOImpl();
        final IWord wrd = dao.findWord("pugnacious");
        for(Iterator iter = wrd.getDefinitions().iterator();iter.hasNext();){
            IDefinition def = (IDefinition)iter.next();
            TestCase.assertEquals(
                 "def should be Combative in nature; belligerent.",
                 "Combative in nature; belligerent.",
                 def.getDefinition());
        }
    }    

    public DefaultWordDAOImplTest(String arg0) {
        super(arg0);
    }
}

As you can see, this test case requires some set up! There has to be a database in place as well as a file (the XML one, which contains all the data for DbUnit). This test case could take a few seconds to run. That’s not a lot in isolation, but think about running it as part of a larger build with hundreds of tests. Those seconds can add up to hours.
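
For a feel of how quickly that adds up, here is a back-of-the-envelope sketch. The test count, the seconds per test, and the number of builds per day are all made-up numbers for illustration, not measurements from a real build.

```ruby
# All figures here are illustrative assumptions, not measurements.

# Total run time, in seconds, of a suite whose tests all take about as long.
def suite_seconds(test_count, seconds_per_test)
  test_count * seconds_per_test
end

# Hours per day spent running that suite if it runs on every build.
def daily_hours(test_count, seconds_per_test, builds_per_day)
  suite_seconds(test_count, seconds_per_test) * builds_per_day / 3600.0
end

puts suite_seconds(400, 3)   # 1200 seconds, or 20 minutes per build
puts daily_hours(400, 3, 6)  # 2.0 hours of test time per day
```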

One key difference between component-level tests and higher-level tests, like system tests (defined next), is that component-level tests exercise code via an API, which may or may not be exposed to clients. In the code above, an object in a DAO layer is essentially tested via an exposed interface. Another example of a component test is exercising an Action class in a Struts architecture via the StrutsTestCase framework. The test below obviously requires a database to be running; however, the container is mocked out and the API exercised isn’t necessarily exposed to clients.

package test.com.acme.mein.web.prot.action;

import test.com.acme.mein.web.action.frmwrk.DefMeinMockStrutsTestCase;
import com.acme.mein.businessobject.impl.project.Project;

public class ProjectViewActionTest extends DefMeinMockStrutsTestCase {

    public void testProjectViewAction() throws Exception {
        this.addRequestParameter("projectId", "100");
        this.setRequestPathInfo("/viewProjectHistory");
        this.actionPerform();
        this.verifyForward("success");

        Project project = (Project) this.getRequest()
            .getAttribute("project");
        assertNotNull(project);
        // JUnit convention: expected value first, actual value second.
        assertEquals("DS", project.getName());
    }

    protected String getDBUnitDataSetFileForSetUp() {
        return "dbunit-seed.xml";
    }

    public ProjectViewActionTest(String arg0) {
        super(arg0);
    }
}

This type of test is also commonly referred to as an Integration Test. The difference between this type of test and a system test is that integration tests (or component tests or subsystem tests) don’t always exercise a publicly exposed API.

System Tests- These tests exercise a complete software system and therefore require a fully installed system, such as a servlet container and associated database. These tests verify that external interfaces, like web pages, web service end points, or GUIs, work end-to-end as designed.

Because these tests exercise an entire system, they are often created in the later cycles of development; furthermore, these tests tend to have lengthy run times, in addition to prolonged set up times. What a trip!

Keep in mind that these tests are fundamentally different from Functional Tests, which test a system much like a client would use it. For example, the test in the code below mimics a browser by manipulating the site via HTTP; however, it doesn’t actually use a browser. A framework like Selenium, which drives a real browser, can be used to create Functional Tests.

The code below is an example of a JWebUnit test case, which attempts a website login and then verifies the attempt was successful.

package test.com.acme.web.cve;

import net.sourceforge.jwebunit.WebTestCase;

public class LoginTest extends WebTestCase {

  protected void setUp() throws Exception {
    getTestContext()
        .setBaseUrl("http://pone.acme.com/meinst/");
  }

  public void testLogIn() {
    beginAt("/");
    setFormElement("j_username", "aader");
    setFormElement("j_password", "a1445");
    submit();
    assertTextPresent("Logged in as aader");
  }
}

While it may not be obvious in the code above, the entire system (a servlet container and a database) has to be installed and running for this test case to work. Note that the set up here isn’t in the test case, but part of a larger aspect of the build.

There are other types of developer tests which, in effect, cross these boundaries (such as performance tests); however, from a high level, these are my suggested terms. Dig it?
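
One way these categories might feed a continuous integration process is sketched below. The category names come from this post, but the dependency lists, the trigger names, and the idea that slower triggers also run the faster categories are all assumptions made for illustration.

```ruby
# Category names follow the post; the dependencies and triggers are assumptions.
TEST_CATEGORIES = {
  unit:      { needs: [],                              run_on: :every_commit },
  component: { needs: [:database],                     run_on: :periodic },
  system:    { needs: [:database, :servlet_container], run_on: :nightly }
}

# Triggers ordered from most frequent to least frequent.
TRIGGERS = [:every_commit, :periodic, :nightly]

# Returns the categories to run for a trigger: its own plus every faster one.
def categories_for(trigger)
  limit = TRIGGERS.index(trigger)
  raise ArgumentError, "unknown trigger: #{trigger}" if limit.nil?
  TEST_CATEGORIES.select { |_name, config| TRIGGERS.index(config[:run_on]) <= limit }.keys
end

p categories_for(:every_commit)  # [:unit]
p categories_for(:nightly)       # [:unit, :component, :system]
```

The point of the split is that the commit build only pays for the tests with no outside dependencies, while the slower categories run less often.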

Reliability’s Skinny

It’s a well-known facet of systems engineering that the reliability of a linear system is the product of the reliability of each of the system’s components. For example, imagine a hip system with three components, shown below.

system of 3 components

Each component in this example system has its reliability measured and the values are each determined to be 90%. If you weren’t a systems engineer (like most of us), you’d probably figure the reliability of this entire system is then 90%. That answer, however, isn’t correct: .90 * .90 * .90 is actually .729, or roughly .73. That is, the overall reliability of this system is about 73%.
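
The arithmetic is easy to check with a few lines of Ruby. The function name series_reliability is just a label chosen for this sketch.

```ruby
# Reliability of a linear (series) system: the product of the
# reliabilities of its components.
def series_reliability(component_reliabilities)
  component_reliabilities.inject(1.0) { |product, r| product * r }
end

three_components = series_reliability([0.90, 0.90, 0.90])
puts three_components.round(3)  # 0.729
```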

Ever driven across a bridge that was 73% reliable? If you had a pen that only worked 73% of the time, wouldn’t you throw it out?

We assume most bridges we drive over are 100% reliable and most pens we use are 100% reliable until they run out of ink. To gain that reliability the builders of bridges and makers of pens ensure reliability at the lowest possible level because that’s the only way to ensure the overall reliability.

This principle, by the way, is why in the Golden Age of Disco Japanese car makers began to eclipse US automakers in sales. The reliability of Japanese-made cars was simply much better than that of their US counterparts because Japanese manufacturers realized they had to ensure reliability at the lowest possible level.

Now imagine a software system, which, by the way, is nonlinear (which essentially means you also have to consider the reliability of the interface or connector between each object). Ever worked on a software system with only three components (i.e. objects)? Most software systems have hundreds if not thousands of objects!

If you wanted to build a software application that had an SLA or QLA of 100% (or close) you’d absolutely have to ensure reliability at the individual object level. In fact, if you can’t ensure and measure reliability at the lowest level, you can’t possibly do that at the system level.
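
To put rough numbers on that, here is a small sketch that treats the system as a simple series of equally reliable objects. That is a deliberate simplification: it ignores the connectors between objects, so a real nonlinear system would fare worse. The 99% figures and the count of 100 objects are assumptions picked for illustration.

```ruby
# Reliability of a series system of object_count objects that each have
# the same reliability. Connectors between objects are ignored here.
def system_reliability(object_reliability, object_count)
  object_reliability**object_count
end

# Per-object reliability needed to hit a system-level target
# (the inverse of the calculation above).
def required_object_reliability(target, object_count)
  target**(1.0 / object_count)
end

puts system_reliability(0.99, 100).round(3)  # 0.366
puts required_object_reliability(0.99, 100)  # a little under 0.9999
```

In other words, under these assumptions 100 objects that are each 99% reliable yield a system that works only about a third of the time, and a 99% reliable system needs each object to be better than 99.98% reliable.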

Yet, this is how we, as an industry, have largely been constructing and delivering software. Design it, build it, then throw it over the wall to QA, who tests at the system level and inevitably finds some number of defects. At some point, we then unleash the system to its customers, who unsurprisingly also find defects, sometimes to the detriment of corporate profits. That’s so establishment!

Bottom line: if we are to build software systems which are truly reliable, we have to ensure reliability at the object level, which can only be achieved through unit testing. Otherwise, we can’t possibly hope to build highly reliable applications.

Code Coverage Article

Every once in a while in this age of Aquarius, a conversation about coverage reports would leave me motivated to write an article elaborating on their dangers. Too often I find them misused as indicators of “test quality” or developer productivity. Similarly, I’ve yet to run across a traditional QA manager who was aware of these reports, much less of their copasetic implications.

I finally put something together in a new series on code quality. Check it out and feel the funk on DevWorks.
