Avoid Irrelevant Symmetry!

When we create test data in TDD, we do it to drive out new capabilities. But not all test data is equally valuable. If your data has the wrong symmetry, it can mislead you about what’s working.

Object Domains

Classes are built from primitive or predefined types, or from other classes. The objects of a class form a domain; its methods are the operations on that domain.

This lets us apply tools from algebra to understand how domains fit together. “Abstract algebra” helps us generalize beyond numbers to many types of objects. 

Programmers are used to working with different types, but it helps to get a feeling for the structure of some important ones:

  • arithmetic
  • boolean logic
  • strings
  • sets
  • lists

When we create new types, we can look to these types for analogous structures. 

Avoid Irrelevant Symmetry

Domains often have aspects that create various symmetries. A symmetry is an aspect that doesn’t change under some transformation. For example, a shape with horizontal symmetry is the same as its mirror image. A triangle still has three sides even if you scale it larger or smaller. 

Why do symmetries create problems? Because a symmetry means that we can’t tell whether a transformation was applied or not! We’d get the same output value in either case.

For example, I’m working on a small app that deals with patterns: grids of booleans representing pixels that are on or off. Suppose I want to check that copying a pattern produces an identical result:

XO     copy    XO
OX     ——>     OX

But here’s the code:

for a = 0..<original.width
  for b = 0..<original.height
    copy[a,b] = original[b,a]         // oops :)

What are the symmetries in my pattern?

  • The width and height are equal
  • The shape is a mirror image on the diagonal

What would be a better input to test with?

XX     Pretty good: not mirrored horizontally or vertically.
OX     It is no longer symmetric across the “\” diagonal, but it is still
       symmetric across the “/” diagonal, and it still has width = height.

Or how about this?

XX     Different width and height, and no mirror symmetries that I can see.
XO
OX
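To see the three candidate inputs side by side, here is a minimal Python sketch. It assumes a pattern is a list of equal-length strings; the names `buggy_copy` and `detects_bug` are hypothetical and are not the app’s real code. The sketch reproduces the swapped-index mistake and asks whether a “copy equals original” test would fail for each input.

```python
def buggy_copy(original):
    """Copy a pattern, reading the source with its two indices swapped (the bug)."""
    height, width = len(original), len(original[0])
    copy = [[None] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            copy[y][x] = original[x][y]  # should be original[y][x]
    return ["".join(row) for row in copy]


def detects_bug(pattern):
    """True if a 'copy equals original' test on this pattern would fail."""
    try:
        return buggy_copy(pattern) != pattern
    except IndexError:
        return True  # a crash also counts as a failed test


symmetric = ["XO", "OX"]        # square and mirrored on the diagonal
better = ["XX", "OX"]           # square, but not mirrored on the "\" diagonal
non_square = ["XX", "XO", "OX"]  # different width and height

print(detects_bug(symmetric))   # False: the bug goes unnoticed
print(detects_bug(better))      # True
print(detects_bug(non_square))  # True: the swapped read runs off the edge
```

Only the diagonally symmetric input lets the bug through; either of the other two would have caught it.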

Helpful Laws

You may have learned certain “laws of arithmetic”; some of them apply outside of numbers :) But not all laws apply to every domain. For example, multiplication of numbers is commutative, but matrix multiplication is not. These laws often reflect symmetries in your domain.

Combining Values

I’ll use ⨁ and ⨂ to represent arbitrary operations.

Associative: Group in any way
(a ⨁ b) ⨁ c = a ⨁ (b ⨁ c)

Example: (3+2)+4 = 3+(2+4)

Commutative: Swap order
a ⨁ b = b ⨁ a

Example: 3+4 = 4+3

Distributive: Combine two operators
a ⨂ (b ⨁ c) = (a ⨂ b) ⨁ (a ⨂ c)

Example: 3*(4+5) = (3*4) + (3*5)
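These three laws can be spot-checked on sample values. The following Python sketch is only illustrative: the helper names are hypothetical, and a check over a handful of samples can refute a law but never prove it.

```python
from itertools import product
import operator


def is_associative(op, values):
    """(a op b) op c == a op (b op c) for every sampled triple."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(values, repeat=3))


def is_commutative(op, values):
    """a op b == b op a for every sampled pair."""
    return all(op(a, b) == op(b, a) for a, b in product(values, repeat=2))


def distributes_over(times, plus, values):
    """a times (b plus c) == (a times b) plus (a times c) for every sampled triple."""
    return all(times(a, plus(b, c)) == plus(times(a, b), times(a, c))
               for a, b, c in product(values, repeat=3))


numbers = [2, 3, 5, 7]
words = ["ab", "c", "de"]

print(is_associative(operator.add, numbers))                  # True
print(is_commutative(operator.add, numbers))                  # True
print(is_commutative(operator.sub, numbers))                  # False
print(distributes_over(operator.mul, operator.add, numbers))  # True
print(is_associative(operator.add, words))                    # True (string concatenation)
print(is_commutative(operator.add, words))                    # False
```

String concatenation shares associativity with addition but not commutativity, which is the kind of difference worth knowing before choosing test values.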

Identities

An identity (with respect to an operation) is a value that “disappears” when combined with other values. We’ll call it ε (to suggest “empty”).

Left Identity:
ε ⨁ x = x

Example: 0 + 7 = 7 — 0 is the left identity element for +

Right Identity:
x ⨁ ε = x

Example: 3 * 1 = 3 — 1 is the right identity element for *
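Left and right identities do not always come as a pair, which a short Python sketch can show. The helper names below are hypothetical, and the checks only sample a few values.

```python
import operator


def is_left_identity(e, op, values):
    """e op x == x for every sampled x."""
    return all(op(e, x) == x for x in values)


def is_right_identity(e, op, values):
    """x op e == x for every sampled x."""
    return all(op(x, e) == x for x in values)


numbers = [3, 7, -2]
words = ["ab", "c"]

for name, e, op, values in [
    ("+", 0, operator.add, numbers),
    ("-", 0, operator.sub, numbers),
    ("*", 1, operator.mul, numbers),
    ("/", 1, operator.truediv, numbers),
    ("concat", "", operator.add, words),
]:
    print(name, is_left_identity(e, op, values), is_right_identity(e, op, values))
# + True True
# - False True
# * True True
# / False True
# concat True True
```

For subtraction and division the candidate value is an identity on the right only: 7 - 0 = 7, but 0 - 7 is not 7.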

Idempotence

Some operations have no further effect when they’re applied a second time.

Idempotence:
f(f(x)) = f(x)

Example: sort(sort(array)) = sort(array)
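A quick way to probe for idempotence is to apply the operation twice and compare with applying it once. This Python sketch uses a hypothetical helper, `is_idempotent`, over a few sample values.

```python
def is_idempotent(f, values):
    """f(f(x)) == f(x) for every sampled x."""
    return all(f(f(x)) == f(x) for x in values)


def reverse(xs):
    return list(reversed(xs))


lists = [[3, 1, 2], [], [5, 5, 4]]
numbers = [-3, 0, 4]

print(is_idempotent(sorted, lists))              # True
print(is_idempotent(abs, numbers))               # True
print(is_idempotent(reverse, lists))             # False: reversing twice undoes the reversal
print(is_idempotent(lambda n: n + 1, numbers))   # False
```

For an idempotent operation, a test cannot distinguish “applied once” from “applied twice”, which is exactly the kind of symmetry that can hide an accidental double call.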

Laws Reflect Symmetry

When various laws apply, they describe symmetries that can make it harder to test confidently.

For example, “+” is commutative in arithmetic: 3+4=4+3. But “-” is not: 3-4 ≠ 4-3

Let’s take an RPN calculator. To use it, you push numbers onto the stack. When you get an operator, you pop the top numbers as its arguments, and push the result onto the stack.

Thus 3+4 is represented as 3 4 +, and during processing the stack looks like this:

      4
3     3      7

Here’s our test:
assertEquals(rpn("3 4 +"), 7)

And here’s our code:

  :
if (operator(token)) {
  let a = pop()
  let b = pop()
  push(functionFor[token](a, b))   // a lookup table holds the implementation
                                   // of this operator
}

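Because “+” is commutative, we don’t notice that we’ve reversed the arguments: the first pop() returns the top of the stack, which is the second operand, so the operator receives (4, 3) rather than (3, 4).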

We’d have detected the mistake if we used this test instead:
assertEquals(rpn("3 4 -"), -1)
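Here is a runnable Python sketch of the same situation. It is an assumption-laden stand-in for the calculator above: the `reversed_arguments` flag is hypothetical and exists only to switch the bug on and off, and tokens are assumed to be integers or one of four operators.

```python
import operator

# Lookup table from operator token to its implementation.
FUNCTION_FOR = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def rpn(expression, reversed_arguments=False):
    """Evaluate a space-separated RPN expression of integers and operators."""
    stack = []
    for token in expression.split():
        if token in FUNCTION_FOR:
            a = stack.pop()  # top of the stack: the right-hand operand
            b = stack.pop()  # next one down: the left-hand operand
            if reversed_arguments:
                result = FUNCTION_FOR[token](a, b)  # the bug: operands swapped
            else:
                result = FUNCTION_FOR[token](b, a)
            stack.append(result)
        else:
            stack.append(int(token))
    return stack.pop()


print(rpn("3 4 +", reversed_arguments=True))  # 7: the bug is invisible
print(rpn("3 4 -", reversed_arguments=True))  # 1: the bug shows up
print(rpn("3 4 -"))                           # -1
```

The commutative operator gives the same answer with or without the bug; the non-commutative one does not.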

Similarly, 0 and 1 would make poor test values: 0 is the identity for “+” and a right identity for “-”, and 1 is the identity for “*” and a right identity for “/”.

Zero and one often make poor test data values. 
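As a small illustration of how an identity value can hide a bug, consider this Python sketch. The `scale` function is hypothetical; its bug is that it never applies the factor at all.

```python
def scale(values, factor):
    """Intended to multiply every value by factor."""
    return [v for v in values]  # bug: factor is never applied


# With the multiplicative identity, the broken code still passes.
print(scale([2, 3], 1) == [2, 3])    # True

# Any other factor exposes the omission.
print(scale([2, 3], 4) == [8, 12])   # False
```

A factor of 1 “disappears” in a correct implementation, so it also disappears in one that forgot the multiplication entirely.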

Sometimes, You Want Symmetry

We can flip this around. Sometimes we need to make sure symmetries exist.

Consider the arithmetic domain: arithmetic is supposed to have most of the laws above, and if we’re checking arithmetic, we certainly want to check those properties. In this case, 0 and 1 will be good test cases (and required, in the case of identity laws). 

If we were checking a mirror-image facility, we’d want to make sure that images with horizontal symmetry are equal to their flipped versions, so that flip(flip(image)) = flip(image) holds for them as well.
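A sketch of such checks in Python might look like the following. It assumes an image is a list of strings and that `flip` mirrors each row left to right; both the representation and the function are illustrative assumptions.

```python
def flip(image):
    """Mirror an image left to right by reversing each row."""
    return [row[::-1] for row in image]


symmetric = ["XOX", "OXO"]   # each row reads the same in both directions
asymmetric = ["XXO", "OXO"]

# Here the symmetry is the behavior under test.
print(flip(symmetric) == symmetric)               # True
print(flip(flip(symmetric)) == flip(symmetric))   # True

# An asymmetric image confirms that flip actually does something,
# and that flipping twice restores the original.
print(flip(asymmetric) == asymmetric)             # False
print(flip(flip(asymmetric)) == asymmetric)       # True
```

The symmetric image checks the property we want; the asymmetric one guards against a `flip` that silently does nothing.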

Vary Your Test Data

One symmetry is not in our domain, but in our tests: we may use the same test data for a series of “independent” tests, either calling a common setup, or copy-pasting it. 

I learned from Brian Marick years ago [no reference]: vary your test data. It isn’t a guaranteed way to identify problems, but it’s not hard to do, and increases your chances just a little bit. 

I especially like this in the context of “Fake It ’Til You Make It” in TDD. There, you write a test and then start with simplistic versions of the production code, making it increasingly sophisticated. But because you’re generalizing from one or a few test cases, you may not realize you haven’t fully solved the problem. Varying the test data in other tests may reveal this omission.
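Here is a small Python sketch of that situation. The function `count_lit_pixels` and its faked predecessor are hypothetical examples in the spirit of the pattern app, not code from it.

```python
def count_lit_pixels_fake(pattern):
    """'Fake It' stage: just enough to pass the first test."""
    return 2


def count_lit_pixels(pattern):
    """Generalized version: count the 'X' cells in every row."""
    return sum(row.count("X") for row in pattern)


first_data = ["XO", "OX"]
varied_data = ["XX", "XO", "OX"]

# Reusing the first test's data lets the fake keep passing.
print(count_lit_pixels_fake(first_data) == 2)    # True

# Different data in a later test reveals that we never generalized.
print(count_lit_pixels_fake(varied_data) == 4)   # False
print(count_lit_pixels(varied_data) == 4)        # True
```

If every test shared `first_data`, the hard-coded return value could survive indefinitely.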

I’ve also seen this issue when one test causes creation of two or more objects. It’s easy to forget to fill in all behaviors for all the objects involved. Varying test data gives you a little extra insurance and a little extra luck.

Example: Top Simulator

The pattern example I mentioned earlier is part of a top simulator I’m writing on Twitch. Let’s look at choosing decent examples for some of the upcoming work.

The top has a row of LED lights that form concentric rings (from persistence of vision). The current implementation has you specify how long to turn a light on or off, but I want to move to a designer’s view where you create patterns and concatenate one or more together.

What is this like? It reminds me of appending strings. We know strings have an identity element (the empty string), and appending is associative (you can group a series of appends however you like and get the same result).

For our case, an empty element makes sense: a pattern with no rows or columns. This is useful to append “visual” patterns together with no gap, whereas you might want some space between letter shapes. So we’ll want to check the identity laws. 

I don’t think we’ll care about associativity – I think we’ll have the whole list of patterns-to-append at once. This could change, of course.

My images default to 5×7, to accommodate letters. But there’s no deep reason for this, and it makes sense to allow other widths as well. (The height is determined by the number of LEDs.)

So I should eliminate the symmetry of “all 5×7” patterns, and use a mix of heights and widths. That brings up a question: what does it mean to mix sizes? I think each row should be padded with blanks to the maximum width of that pattern, and each pattern should be extended to the maximum height of the pattern it’s joined to.

xxx     xo       xxxxo
xox  +  x     =  xoxxo
oox              ooxoo

I’ll want to test mixed widths and mixed heights, in both orders. (Concatenation is not commutative, but I could imagine accidentally treating “wide + narrow” and “narrow + wide” inconsistently.)
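The padding rule and the tests it suggests can be sketched in Python. Everything here is an assumption made for illustration: patterns are lists of strings, "o" stands for a blank, missing rows are added at the bottom, and the names `normalize` and `concatenate` are hypothetical rather than the simulator’s actual API.

```python
def normalize(pattern, height):
    """Pad each row with 'o' to the pattern's widest row, then add blank rows up to height."""
    width = max((len(row) for row in pattern), default=0)
    rows = [row.ljust(width, "o") for row in pattern]
    rows += ["o" * width] * (height - len(rows))
    return rows


def concatenate(left, right):
    """Join two patterns side by side after extending both to the taller height."""
    height = max(len(left), len(right))
    return [a + b for a, b in zip(normalize(left, height), normalize(right, height))]


wide = ["xxx", "xox", "oox"]
narrow = ["xo", "x"]
empty = []

print(concatenate(wide, narrow))   # ['xxxxo', 'xoxxo', 'ooxoo']
print(concatenate(narrow, wide))   # ['xoxxx', 'xoxox', 'oooox']

# Identity laws, checked here on a rectangular pattern.
print(concatenate(empty, wide) == wide)   # True
print(concatenate(wide, empty) == wide)   # True
```

The first result matches the diagram above; the second exercises the “narrow + wide” order, where the shorter pattern is on the left. Note that a ragged pattern such as `narrow` comes back padded, so the identity laws only hold exactly for patterns that are already rectangular under this sketch.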

I won’t worry about horizontal or vertical symmetry; I don’t think I’ll write any loops backwards. Using a variety of (non-square) sizes will address diagonal symmetry.

Conclusion

Unexpected symmetry can make poor code appear to work, and make it impossible to tell whether certain transformations have been applied. You can check for “laws” (or properties) of your domain to help identify values that might cause these issues. In some cases, however, you want to check that symmetries hold, so you can’t just forgo all symmetries.

A process-level symmetry occurs when you use the same test values for a series of tests. Varying your test data is not expensive, and occasionally will reveal that you’ve missed some situations. 

References

“Symmetry”, Wikipedia. Retrieved 2021-06-28. 

“GraphicalApp Playlist” – in Bill Wake’s YouTube channel – episode Top-21-06 introduces the Pattern class, and episode Top-21-07 addresses pattern concatenation. These are available on Twitch until mid-July, on YouTube after that.