Internal Objects: Testing Graphics, Part 4

In our final article in the series on testing graphics, we’ll look at testing internal objects (rather than testing at the display level). We’ll look at several decisions around testing.

The first decision concerns levels of the system: where you drive tests and where you observe results. The second decision looks at how you test: by observing state or observing conversations between objects. The final decision looks at whether automation is the best choice at all.

Where Will You Test?

Recall the generic drawing model from part 1:

We can generalize this to any layered system and its tests. (It need not even be a strictly layered system; we’re really starting from an object, then considering objects it talks to, or objects those talk to, etc.)

Generic Layers - where to drive, where to test?

Two key questions are “Where will you drive?” and “Where will you observe?”

We have a choice about where to observe. If we observe close to the layer we’re driving, we’re closer to the semantics of what we’re testing. If we observe from further away, we’re closer to the “real” effects.

For example, consider a video editor where you arrange multimedia clips on a timeline. One test might check that when you select a clip, the selection (a solution domain concept) makes certain menu options available and wraps a blue outline around the selection’s icon.

Another test might trigger selection of an item, then check that “The item I told to select knows that it is selected; no other items think they are selected.” A different test, checking more distant layers, might check whether a drawing command was issued to draw a blue rectangle at a certain location. At the display level, a test might check the pixel at (100, 200) is blue.

Think of the layers as differentiated vs. generic, or complicated vs. simple. At the high level, there may be tricky rules and interactions. At the display level, it may all be pixels set to different colors.

Considering what layer to test at involves tradeoffs:

How much do we want a unit test vs. an integration test?
Can we tolerate the slower speed of testing through more layers?
Do we understand the impact of an action at all levels, in a way we can conveniently express?

My default stance is to emphasize unit tests where feasible, testing the links of the chain before checking the whole chain. Design approaches such as Model-View-Controller or its cousins help make this easier.

Testing Internal Objects: What You Do or How You Talk?

There are two common ways to check an object’s functionality when it has a collaborator (as do the layered systems we’re talking about).

Think of the Object Under Test (OUT) as one layer, and its director collaborators as being in the next layer.

Imagine that a display object includes these methods in its API:

get(x, y) -> Color
set(x, y, color)

One way to test an Object Under Test (OUT) is to use the real object it collaborates with:

display = new Display();
OUT = new OUT(display)
OUT.doSomething()
assertEquals(display.get(3,17), Color.BLACK)

This test checks the effect of what happens. It gives us confidence that we’ve integrated with the real display, but it can run slowly.

A second testing approach is to fake or mock the display, having it remember what was called:

Display display = mock(Display.class);

OUT out = new OUT(display)
out.doSomething()

verify(display).set(3,17, Color.BLACK)

This test checks the conversation between OUT and its collaborator. One benefit of this is performance: the mock is probably doing a lot less work than the display object. Another benefit is that we can use this approach to derive the necessary API for its collaborator. (Though in this case that API was already provided.)

This approach has a downside: it assumes we understand the collaborator’s API correctly, and that if we follow its protocol, we’ll get a good result. Suppose the display has a method setMode() that must be called before the first call to setPixel(). If we don’t realize that, our OUT will think it’s working fine with setPixel() alone, but when it uses a real display it may fail to work right. This approach also assumes there’s one reasonable way to do things; if our API supported lines and polygons too, we might not know which call to expect.

Unfortunately, graphics toolkits are typically not designed with testability in mind. For example, one project I worked on used an API that would run fine on the programmers’ systems, but would sporadically fail during the automated build and test. It turned out that the toolkit only fired events when tested on a system with a monitor; on a headless system, some parts were never activated. (We used a pool of build systems, some of which lacked monitors.)

Making sure you’re right (without peeking at the display) is hard. Jim Blinn, in Jim Blinn’s Corner: A Trip Down the Graphics Pipeline (p. 73), describes the challenge of keeping the signs right (positive or negative): “You analyze and analyze and still only have a 50% chance of getting it right. That’s why you have to try out your program instead of just theorizing. You have to test your program thoroughly to see if the signs are OK. Then after the fact, you go back and derive why the minus sign had to be there anyway.”

Sidestepping Automation

Most of this series of articles has focused on automation. But before leaving the series, we want to remember that there’s an overall tradeoff of automation vs. manual testing. (Brian Marick has a good discussion.)

In some cases, the risks are not in the graphics part of the application, so spending a lot of testing effort there may be wasted. For example, suppose the system has been designed to thoroughly separate business logic from display (e.g., by using Model-View-Controller (MVC), MVVM, or a related variation.) We may perceive a lot of risk in the business logic, and far less risk in the standardized graphic components.

Automation can be expensive: we have to automate both the driver portion (controlling the object to test), and the verification (how and where we look to check the behavior). Many tests are harder to automate than to run manually, so we have to consider how many times the tests will be performed and whether they’ll pay back any extra cost.

User interfaces tend to change more than the rest of the code, which means end-to-end testing is even more vulnerable to changes: changes in any layer may require a test to be revised.

And yet, even when graphics are simple, it’s easy to make mistakes. I worked with a group that several times did all the work to test and implement a feature, only to have it rejected by the first person who tried it: the programmers forgot to hook up the button that activated it. The simplest end-to-end test would have caught this.

Most of the time, I tend to be guided what Mike Cohn calls the testing pyramid: lots of unit tests, some service tests, and a few end-to-end tests.

But the best testing approach depends on your context: if you’re working on life-critical software, you’ll have a different approach to automation than if you’re working on a text-based game.

It may seem odd to close a series on automation by encouraging you to sidestep automation, but there are always more tests we could create than we have time for, and we have to make economic tradeoffs if we hope to deliver anything.

Conclusion

We’ve looked at three aspects of testing internal objects:

Deciding where to test: the adjacent layer, or something further away?
How to test: by looking at the resulting state, or by monitoring the conversation between objects (typically by a fake or mock object)?
Should we automate at all? It’s an economic decision.

“The Test/Code Cycle in XP, Part 2: GUI” – an early article showing a way to do TDD with visual components.
“Models: Testing Graphics, Part 1“
“Displays: Testing Graphics, Part 2“
“Screen-Based Tests: Testing Graphics, Part 3“

Where Will You Test?

Testing Internal Objects: What You Do or How You Talk?

Sidestepping Automation

Conclusion

Related Articles