TDD is an Alchemy Trick: Points and Partitions

Alchemists sought to transmute lead into gold. TDD turns a small set of examples into code that handles the whole range of inputs. Quite a trick – how does it work? TDD often implicitly uses the idea of partitions; we’ll see how this works.

What’s a Partition?

A partition divides a set into subsets (also referred to as partitions), such that:

No partition is empty
The union of all partitions is the original set
No two partitions have any values in common

For example:
For {a,b,c}, here is a valid partition: {a}, {b,c}
Here’s another one: {a}, {b}, {c}
This one is invalid (as “b” occurs in two subsets): {a, b}, {b, c}, {c}

For integers, a common partition is Positive-Zero-Negative (PZN): {x : x > 0}, {0}, {x: x < 0}
Another partition is {x: x<1}, {x: x >= 1 and x < 7}, {7}, {x: x>7}

TDD via Partitions

The classic description of TDD is Red-Green-Refactor:

Red – Pick an example input value. Write a failing test for it.
Green – Write code that makes the test pass.
Refactor – Improve the code you wrote.
Repeat till done

Adding in partitions reveals there’s more going on:

Red – Pick an example input value from a partition. Write a failing test for it.
Green – Write code that makes the test pass.
Generalize – Modify the code so all tests still pass, and the rest of the partition works too.
Refactor – Improve the code you wrote, generalizing between partitions where possible.
Repeat till all partitions are handled

TDD Micro-Moves

In Test-Driven Development, Kent Beck describes three strategies to get from red bar to green bar:

Obvious Implementation – don’t bother with examples, just implement the code.
Fake It (’Til You Make It) – start by returning a constant, gradually replacing it with variables until you get a full implementation.
Triangulate – generalize from two or more examples.

I think of these as generalizing from zero, one, or many examples. The micro-moves described below are clearly in the same family.

A Running Example

Let’s consider the absolute value function:

            {  -x, if x < 0
abs(x)  =   {   0, if x=0
            {   x, if x > 0

The cases in the definition make the partitions easy to identify. We’ll handle Zero, then Positive, then Negative.

Here’s our TDD session:

Next Test	Code	How
–	`throw`	Initial version
0 (“zero”)	`if x == 0 return 0 else throw`	Hard-coded result (red to green)
1 (“positive”)	`if x == 0 return 0 else if x == 1 return 1 else throw`	Hard-coded result (red to green)
–	`if x == 0 return 0 else if x == 1 return x` `else throw`	Generalize (green to green)
–	`if x == 0 return 0 else if x > 0 return x` `else throw`	Broaden to whole partition (green to green)
2	// same	Confirmatory test (green)
–	`if x == 0 return x else if x > 0 return x else throw`	Harmonize (green to green)
–	`if x >= 0 return x else throw`	Refactor (def. of >=) (green to green)
-1 (“negative”)	`if x >= 0 return x else return -x`	Obvious Implementation (red to green)

In that last step, we used Obvious Implementation, but could have gone more slowly with a hard-coded result + generalize + broaden + refactor.

Let’s look at the micro-moves in more detail.

Choosing Partitions

With TDD, we don’t have to define our partitions all in advance, and we can switch partitions along the way.

It’s fine if you do have a starting strategy for the partitions you want – you have to start somewhere. But if you’re wandering around, you may be able to take a few examples and harmonize them (i.e., triangulation), and that may give you an idea of a better partition.

Our goal: Identify partitions where the abstract behavior is the same.

There’s an art to choosing partitions, and it’s often specific to the function being implemented.

Pattern: Hard-Coded Result

In some ways this is the simplest move – just choose the next input-value, and add code that boils done to:

if input == <<input-value>> { return <<hard-coded-result>> }
else { existing code }

This is a useful and helpful move, but it’s not scalable. Peeling off values, one input value at a time, isn’t enough for most problems.

Pattern: Generalization (Transmutation)

Generalization is the key “trick” in TDD. It moves us from code that works for a specific value to code that supports many inputs.

I think of generalization as a semi-refactoring: a transformation that is valid for the inputs so far, but is not valid with respect to all values. In our example, “1 => x” is not a valid refactoring in general, but it is safe for the tests we have so far.

Generalization uses the existing tests as a guard-rail – we have to keep passing them. This eliminates the most foolish generalizations.

Let’s see it in slow motion:

   … if x == 1 return 1
=> 
   … if x == 1 return x

We went from 1 to “x” because we know (or can convince ourselves) that it’s the generalization that will do the job for the positive numbers. (Not a tough guess, given our spec.)

But what if we did a different generalization?

=>   … if x = 1 return x*x

The tests pass (for 0 and 1). But if we ship this without testing positive numbers further, we won’t realize that we have a defect.

What stops this? Nothing. Nothing guaranteed, anyway. We have to build on what Mike Hill calls “The Judgment Premise” – and use our own human judgment.

This risk is why I talk about the “trick” of TDD rather than the “algorithm” – there is no algorithm, and there are no guarantees. We have to build our judgment and confidence, so we have an intuition when things aren’t right.

In partition terms, here is the key challenge: Can we write generalized code that works for each member of a partition, even though we’re not checking each member?

Pattern: Broadened Partition

We may start with a specific example, good for only one value.

    … if x == 1 {return x}

If we’ve generalized it properly, we expect that it will work for not just one value from the partition, but all of them.

If – We have a condition clause (or other code) that lets us run the code for a specific value, but not for all values we have in mind for our partition,

Then – Make the condition reflect the full scope of the partition.

This is another tricky generalization – we move our claim from “this value works” to “the whole partition works”, without providing any more “proof” (to the extent tests are proof).

Pattern: Confirming Test

Many times, we write a TDD test expecting “first time red” – the test should fail, and if it doesn’t there’s a problem (at least in our head). But from a partition point of view, sometimes we write tests that we expect to pass (“first time green”), confirming that we can take an arbitrary member of our partition, and the current generalized code will handle it.

I often use this as a chance to try a complicated situation, especially if I’ve only tested the pieces solo or pairwise. I have in mind something like Hans Buwalda’s “soap opera” testing, but smaller. (In soap opera testing, you create an extreme situation, rather than a simple one.)

In the absolute value example, the test for “2” would catch us if we had tried to generalize to x*x.

Pattern: Harmonized Code

Harmonization is another form of generalization. Its goal is to make two or more pieces of code more similar, so that we can merge their partitions.

For example:

    if x == 0 return 0 else if x > 0 return x…
=>
    if x == 0 return x else if x > 0 return x…

Looking only locally, there’s no reason “x” is better than 0 for the first case. But looking across the whole method, we can see it makes the cases more uniform.

Having both branches the same sets up a refactoring that will eliminate the duplication. This is an example of making things worse (more duplication for now), so we can make them better (even simpler later).

Harmonization is a way to implement Kent’s Triangulation strategy: implement multiple examples, then harmonize them if you can.

Refactoring: Merge Conditional Branches

If we have code like this:

   if condition1{ block-A } 
   else if condition2 { block-A} 
   else …
=>             (assuming no side effects in the conditions) 
   if condition 1 || condition2 { block-A }

In this case, we’re merging the corresponding partitions.

In our absolute value example:

    if x == 0 { return x }
    else if x > 0 { return x }
    else …
=>
    if (x >= 0) {return x}
    else …

Our original partition was Positive-Zero-Negative, but now we’ve switched to {x: x < 0}, {x : x >= 0)]

Conclusion

The alchemy of TDD is based on generalization: given one or more examples, can we write code that will handle the partition the values are from? If so, the lead turns into gold.

If not, we’ve got some things to learn: how to make better partitions, new design ideas, how to generalize better, … something.

We succeed often enough to make TDD a powerful tool.

References

“The Judgment Premise”, by Mike Hill. Retrieved 2021-01-10.

“Soap Opera Testing”, by Hans Buwalda. Retrieved 2021-01-10.

Test-Driven Development, by Kent Beck.