Starting Unit Testing in a Huge Codebase

Most people agree that Unit Tests are a good idea, and most developers try to write them (with varying degrees of success). But the challenge of creating Unit Tests for existing projects can be incredibly daunting to developers.

Many people may not see the need, if an area of code has been working for quite some time then why introduce tests which will take a lot of time and (if you need to refactor to make them work) introduce risk in an area? When I wrote about the value of Unit Tests back in 2015 I postulated that the value of these small executable is not in finding bugs, it’s in preventing bugs in the future.

Unit Testing, unlike exploratory is not about finding issues with existing code, it’s about reinforcing and documenting (through executable code) exactly how each function, class, and property should behave in particular circumstances. If you understand this then you’ll see that the value of Unit Tests does not diminish with an established solution (even if most of the teething bugs have already been worked out). In fact it makes your existing code bases much easier to safely maintain!

So the question becomes how and where do we start?

I’m currently working on an application which has in excess of five million lines of code in it. Some parts are new and some date back to the project’s inception. Writing a full set of tests for the entire application is a monumental (and I have to admit largely pointless) exercise.

What we need to do is look at the areas which are most in flux. If Unit Tests are a technique for helping to protect our software against unexpected change then the areas where they deliver the most benefit are the areas of code which change frequently.

We’re engineers, not psychics (at least I’m not) so use metrics. Look at your Source Control history and see which classes are subject to frequent change (both bug fixes and feature work applies). Target your Unit Tests here, use the 80:20 rule and target your efforts in the right place. One of my favourite sayings applies here… how do you eat an elephant? A little at a time!

Slowly, little by little the classes which see the most changes will stabilise and you’ll introduce less bugs when expanding them or fixing existing defects.

These are my thoughts, do you have any experiences breaking down huge applications into Unit Tested code? How did you do it?

Why write Unit Tests?

Testing is a passion of mine and it’s something I expect to write about a lot more in the future. What I feel is discussed less often though is “Why are we writing tests?”

When many people talk about writing tests they talk about writing Unit Tests, classes and methods broken down into small, isolated areas of code which can be examined with tests to guarantee code quality. Let’s look at an example:

public string ReadText(string filename)
{
  if(File.Exists(filename))
  {
    return File.ReadAllText(filename);
  }
  else
  {
    return null;
  }
}

This code reads the text in a file and returns it, if the file doesn’t exist it returns null.

The thing is, this is trivial code. Writing Unit Tests for something like this is overkill surely? That thirty minutes or so could be invested in the next feature, a bug fix or meeting the tight deadline.

Let’s suppose another developer comes along in a few years time, they see this method and know something their predecessor didn’t. .NET has a FileNotFoundException exception! Deciding to be a conscientious coder they update the method to throw an appropriate exception if the file isn’t found.

public string ReadText(string filename)
{
  if(File.Exists(filename))
  {
    return File.ReadAllText(filename);
  }
  else
  {
    throw new FileNotFoundException(string.Format("The file '{0}' was not found", filename));
  }
}

Unfortunately our conscientious developer missed something. One of the myriad of methods which calls our methods is GetOrCreateFileWithContents

public string GetOrCreateFileWithContents(string filename, string defaultContents)
{
  var currentContents = ReadText(filename);
  if(currentContents == null)
  {
    currentContents = CreateFile(filename, defaultContents);
  }

  return currentContents;
}

Because of our change this method now fails, ReadText throws an exception and the new file is never created in it’s place. This may be an oversimplified example with an overzealous and (dare I say it careless) tidy up but it illustrates the risks we take every day when refactoring and improving code.

This is the true value of Unit Tests, not in finding bugs but in defining the behaviour of the method. If our original developer had invested that extra thirty minutes our Boy Scout would have had some warning when then tried to update the method, they’d have seen that they’d changed the behaviour of the class in an unacceptable way and wouldn’t have made their changes.

The moral of the story? Unit Tests may seem like overkill while you’re writing them. But spare a thought for the poor soul who’s trying to read your methods in a few years time… Leave them a map, a series of executable tests which guarantee that your required behaviour remains unchanged. Your thirty minutes could save them hours, prevent bugs being introduced and help keep your application stable.