TDD: Tension, Release, and Generalization

Test-driven development uses a tight cycle of "test, code, refactor" to develop software.

Tension and Release
I use the analogy of a stoplight: you start with a green light (all tests passing). Then you write a test; often you're referring to classes or methods that don't exist yet, and get a compiler error (yellow light). You fix this error by writing stubs, and when you run the test it fails (red light). Then you add just enough code to make the tests pass (green light).

A less obvious part of this style is that you have a "green light" most of the time: just as when driving, you prefer not to spend long at a yellow or red light.

The most obvious use of this principle is in a technique Kent Beck calls "Fake It ('Til You Make It)": you get rid of the red light by making the code return exactly the answer the test expects. For this test:

public void testLength() { assertEquals(3, buffer.size("abc")); }

you might write this code:

public int size(String contents) { return 3; }

The first time you see this approach, you're likely to think "that's cheating!" We're supposed to be writing a program to do some complicated thing, and we come back with a trivial answer like that.

When you have tests that don't pass, you should feel tension: you have tests, and you have code, and they don't agree about how the code should work. It's the kind of tension from holding your breath, or trying to thread a needle, or watching a movie where people don't realize the danger they're in (depending on how much tension you like, I guess). It's unstable: what will happen?

The solution is to get out of the unstable state quickly, and release that dramatic tension.

This is important, because most of our tools work best when the system is working:

If our system works for the tests we have, but not for some tests we don't yet have, we can add those tests.
If other people have made changes, and checked in code with all tests passing, we can integrate their changes and know that if tests fail it's our problem.
If our system isn't as well-designed as it could be, we can refactor it. But refactoring is best applied when all tests pass.

Generalization
Once we've done the "fake it" part, we can use generalization to do the "make it" part. Generalization is a relative of refactoring: where refactoring tries to preserve behavior, generalization tries to, well, generalize it.

In our example, we created a fake response "return 3" for a reason: 3 is the right answer, determined by looking at the length of the string. We can generalize our answer from "return 3" to "return contents.length()".
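
As a rough sketch of that step, here are the two versions side by side. The class names FakedBuffer and GeneralizedBuffer and the check helper are hypothetical, made up for this illustration; in practice you would edit the one buffer class in place and use your test framework's assertEquals.

```java
// Hypothetical names for illustration; a real project would have a single buffer class.
class FakedBuffer {
    // "Fake It": return exactly the value the one existing test expects.
    int size(String contents) {
        return 3;
    }
}

class GeneralizedBuffer {
    // "Make It": replace the constant with the computation that produced it.
    int size(String contents) {
        return contents.length();
    }
}

class GeneralizationSketch {
    // Stand-in for a test framework's assertEquals, so the sketch has no dependencies.
    static void check(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        // The existing test passes both before and after generalizing.
        check(3, new FakedBuffer().size("abc"));
        check(3, new GeneralizedBuffer().size("abc"));
    }
}
```

The light stays green across the change: the only test we have cannot tell the two versions apart, which is exactly why generalization is a guess rather than a refactoring.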

Generalization goes beyond the tests we already have, to the ones we haven't written yet. (Refactoring, in its pure form, preserves behavior for tests we have written, as well as the ones we could write.) Generalization thus makes a guess about the tests we would write. Sometimes this guess can be wrong, so we may write other tests that we think will pass, just to verify it.
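
Here is a small sketch of what those verifying tests might look like, assuming the generalized "return contents.length()" version. The names LengthBuffer and GuessCheck, the check helper, and the extra inputs are all assumptions chosen for illustration, not part of the original example.

```java
// Hypothetical class standing in for the buffer after generalization.
class LengthBuffer {
    int size(String contents) {
        return contents.length();
    }
}

class GuessCheck {
    // Stand-in for a test framework's assertEquals, so the sketch has no dependencies.
    static void check(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        LengthBuffer buffer = new LengthBuffer();
        check(3, buffer.size("abc"));           // the test we started with
        check(0, buffer.size(""));              // new test we expect to pass: empty contents
        check(11, buffer.size("hello world"));  // new test we expect to pass: longer contents
    }
}
```

If one of the new checks failed, the generalization guessed wrong, and we would be back at a red light with a concrete test telling us what to fix.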

I think of test cases as the shadows of behavior. We write code that we think will project that shadow. The "Fake It" part is like using a cutout to generate the shadow – it's not a true 3-d object, but it's good enough to show the right shadow. Then generalization can be used to "inflate" the code into a full object.