Programming: Using What’s Hidden

In building a cache for the SortTables™ data viewer, I ran across three hidden things.

An LRU Cache

I want to view data sets that are larger than memory, so I developed a cache that gets rid of the Least Recently Used (LRU) item (on the theory that things that have been referenced recently are most likely to be referenced soon).

I used a classic implementation, a dictionary pointing into a doubly-linked list. The dictionary finds an item quickly. The doubly-linked list easily moves the most recently referenced item to the front (leaving the LRU at the rear).

My implementation had a doubly-linked list with head and tail pointers. I identified these four cases: empty list where head=tail=nil, one-item list where head=tail=item, two-item list where head.next = tail and tail.prev = head, and an n-item list where nodes in the middle don’t interact with head or tail.

Four types of lists: empty, one-item, two-item, n-item

The key operations are moving an item to the head, and removing the item at the tail.

An Unhappy Result

I implemented these cases, but I wasn’t happy with the result: I could see that the code for handling the head and tail wasn’t symmetric.

But – my operations were so limited they didn’t provide a non-destructive way to examine the list.

There’s an idea from abstract data type theory: hidden operations (and types). The public operations of a type may not be powerful enough to specify its desired behaviors. You may need hidden operations as well. (This always struck me as a profound insight.)

The public operations of a type may not be powerful enough to specify its desired behaviors. You may need hidden operations as well.

The hidden operations may be used to specify behavior, but they’re not made accessible to clients of the type. Furthermore, just because the spec uses them doesn’t mean the implementation has to have them. (It may find another way to meet the spec.)

I wanted to maintain the invariant that “the list traversed backward equals the reverse of the list traversed forward”. I defined a hidden function that could reveal enough to establish this, and made it accessible to the test. That quickly revealed that my implementation was as broken as I’d suspected.

The first hidden: hidden operations. Good tests can usually depend just on the public interface, but that is not always true.

Addressing the Broken Code

You might think that with the hidden operations in place, I would go back and tug my code into working shape.

But I was unhappy: my tests and code did not give me confidence. So I deleted them (keeping only the method declarations) and started over.

The second hidden: the hidden option of reverting bad work rather than pushing forward with it.

A Different List

Part of what made that previous version so painful was that it had so many cases to cover variations of head and tail nil or not and the possibilities for whether head = tail.

It had been a long time since I’d implemented a doubly-linked list, and I’d forgotten how much smoother things go if you use a sentinel node. This node doesn’t contain real data, and it’s never revealed to the user, but it means head and tail always have something to point to, so it gets rid of many special cases. (In fact, this node acts as both head and tail.)

If you’re careful not to remove from the empty list (when sentinel.next = sentinel.prev = sentinel), cases for 1, 2, or more items now all act the same.

List with a hidden sentinel: empty, one-item, or many items

This came together much more quickly than the original (flawed) implementation, and the result was simpler and suitably symmetric:)

The third hidden: a hidden node. Hidden nodes (or other structures) can simplify code by making the internal representation more uniform.

Lessons

  • Tests occasionally need privileged access to (otherwise) hidden operations to let them test a type properly.
  • The hidden option to throw away and re-do something that came out poorly can be cheaper and more reassuring than force-fixing something sub-par.
  • Sentinels and other hidden structures may let you simplify code.

References