Swimming in a Rich Domain

Many domains are shallow and impoverished, e.g., apps that exist to write well-understood fields into a database. A rich domain has many concepts (some contradictory); it has lots of history; and it’s full of incidental complexity, the consequences of human decisions and workarounds.

Guitar-shaped swimming pool - by Daniel Spils, CC BY 2.0 - https://www.flickr.com/photos/danielspils/2910492/

One huge challenge for rich domains is that you can drown in all the nuances and variants. A North Star can prevent that.

We’ll look at the domain of music as an example, exploring some of its depth.

Finally, we’ll look at some guidelines that can help.

Your North Star – Pull, Not Push

The key to working with rich domains is to avoid getting sucked into the vortex of complexity. Have a North Star – something that’s driving you to use parts of the domain. Generally, this is the needs of the application you’re working on.

The opposite approach, starting from the domain and trying to model everything in it, puts a large and unnecessary barrier in your way. You end up spending a lot of time handling complexity that your app may well not need. The up-front analysis can become arbitrarily large. (People can spend years getting a PhD on some of the fine points.)

Better to ask, “What does my system need?” and target those areas early on. In especially large domains, you may need to work incrementally, handling the most important cases first.

I worked in library systems for a while, and I recall a project that had to deal with bibliographic records in MARC format. I was struck by how many kinds of titles there are – cover title, title page, spine, etc. You could spend a lot of time understanding each of these and how they’re represented – but for most applications, you just want to include the title as a searchable variant and not worry about the rest.

There are people whose job it is to worry about all those titles, and some apps have to treat them as a core subject. But it’s probably not the first thing an app needs to handle.

Let your system pull what’s needed from the domain; don’t have the domain push everything it knows into the application.

Music – An Irregular, Rich Domain

My most recent program is exploring tune similarity. I’m interested in old-time music, including the Child ballads, and these tunes sometimes have hundreds of variants.

A lot of these tunes are available in “abc” notation. Here’s the start of “Twinkle, Twinkle, Little Star”:

CC GG | AA G2 | FF EE | DD C2 |

Letters are note names; numbers tell relative note duration.

The heart of the format is simple, but it’s got many features (and a 100+ page description of them).
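To make the "simple heart" concrete, here is a minimal sketch of reading just the fragment above. It is not a real abc parser: it assumes only note letters, optional integer lengths, bar lines, and spaces, and the names `NOTE` and `read_notes` are hypothetical.

```python
import re

# One token: a note letter A-G (either case) followed by an optional integer length.
# Assumption: no accidentals, octave marks, rests, or header fields are present.
NOTE = re.compile(r"([A-Ga-g])(\d*)")

def read_notes(abc: str) -> list[tuple[str, int]]:
    """Return (letter, relative_length) pairs; bar lines and spaces are skipped."""
    notes = []
    for letter, length in NOTE.findall(abc):
        # A missing number means the default length of 1.
        notes.append((letter, int(length) if length else 1))
    return notes

twinkle = "CC GG | AA G2 | FF EE | DD C2 |"
print(read_notes(twinkle)[:7])
# [('C', 1), ('C', 1), ('G', 1), ('G', 1), ('A', 1), ('A', 1), ('G', 2)]
```

Everything the 100+ page description covers beyond this fragment is deliberately left out.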

What’s a Note?

The first analysis I’ve done is called a Parsons Code. It tells whether each note goes up, down, or repeats its predecessor.

For “Twinkle, Twinkle” (above), the Parsons Code looks like this:

* r u r u r d d r d r d r d

and you can graph it like this:

    /‾\
  /‾   \_
*-       \_
           \_
             \

Generating the UDR codes is straightforward: pair each note with its predecessor (match the list with itself, offset by one), then compare each pair for up, down, or repeat.
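The offset-by-one comparison can be sketched in a few lines. This is an illustration rather than the program's actual code; it assumes pitches are given as comparable numbers (MIDI note numbers here, which is my choice), and `parsons_code` is a hypothetical name.

```python
def parsons_code(pitches):
    """Build a Parsons Code from a sequence of comparable pitch values."""
    steps = []
    # zip the list with itself shifted by one to get (previous, current) pairs.
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("u")
        elif cur < prev:
            steps.append("d")
        else:
            steps.append("r")
    # By convention the first note is written as "*".
    return " ".join(["*"] + steps)

# Assumed MIDI numbers: C=60, D=62, E=64, F=65, G=67, A=69
twinkle = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]
print(parsons_code(twinkle))  # * r u r u r d d r d r d r d
```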

But – when are two notes equal?

When we speak of a note, we are usually talking about pitch (frequency) and duration. Notes have a name A-G, and optional accidentals: 𝄫 ♭ ♮ ♯ 𝄪. A single pitch may have multiple names: F♯ = G♭ = E𝄪.

We also have the notion of a key, which has a default set of accidentals. In the key of D, you have F♯ and C♯. When you write an F in the key of D, it implicitly means F♯. (You can get to a plain F by explicitly writing F♮.)

My Note class reflects that: it can be a plain note, a note with accidentals, or a note in a key with implicit accidentals.

In many situations, to compare notes, you really want to compare the underlying pitch: F♯ and G♭ are the same pitch. (Like everything in music, there can be exceptions, but this is at least true for piano music.) So, I’ve separated Pitch (frequency) from the Note.
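One way the Note/Pitch split might look is sketched below. The class shapes, the `KEY_SIGNATURES` table, and the use of a 0-11 semitone number in place of a frequency are all assumptions for illustration (equal temperament, octave ignored), not the actual classes.

```python
from dataclasses import dataclass
from typing import Optional

SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"𝄫": -2, "♭": -1, "♮": 0, "♯": 1, "𝄪": 2}
# Implicit accidentals per key; only two keys are listed in this sketch.
KEY_SIGNATURES = {"C": {}, "D": {"F": "♯", "C": "♯"}}

@dataclass(frozen=True)
class Pitch:
    semitone: int  # 0-11 within the octave; stands in for frequency

@dataclass(frozen=True)
class Note:
    letter: str
    accidental: Optional[str] = None  # None means "use the key's default"
    key: str = "C"

    def pitch(self) -> Pitch:
        accidental = self.accidental
        if accidental is None:
            accidental = KEY_SIGNATURES[self.key].get(self.letter, "♮")
        return Pitch((SEMITONES[self.letter] + ACCIDENTALS[accidental]) % 12)

# Different names, same pitch.
print(Note("F", "♯") == Note("G", "♭"))                  # False
print(Note("F", "♯").pitch() == Note("G", "♭").pitch())  # True
# A written F in the key of D implicitly means F♯.
print(Note("F", key="D").pitch() == Note("F", "♯").pitch())  # True
```

Comparing notes by name and comparing them by pitch are now separate questions, and the Parsons Code only needs the second.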

The Challenge of a Rich Domain

In a rich domain, with an uncertain target, it’s hard to know what’s important. In any given sub-area, you can go deeper and deeper.

Notes are more complicated than I described above. In reality, notes needn’t have fixed pitches; they might use a distinct tuning. On instruments such as guitars, the same pitch might be available on several strings (and these may have subtle differences in tone).

Notes can be played with various touches, at different volumes, with different fingers. They might be preceded by grace notes. They may be part of a phrase, and affected by that. On and on and on.

A Parsons Code analysis doesn’t need any of this, but other analyses might!

The Parsons Code has another challenge. Two songs might be essentially the same, but one version has extra notes. You can hear this if you know both “Twinkle, Twinkle” and the alphabet song. They’re basically the same tune, but when you sing “LMNOP” you use more notes than when you sing “what you are”.

We’d like to be able to “quantize” – only consider pitches at the same point in the measure, ignoring the “extra” notes. That will require tracking the rhythm of notes, but it makes the Parsons Code more likely to relate similar songs.
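A rough sketch of one possible quantizing rule: keep only the notes that start exactly on a beat. The rule, the function name `on_beat_pitches`, the MIDI numbers, and my transcription of the two phrases' rhythms are all assumptions.

```python
from fractions import Fraction

def on_beat_pitches(notes, step=Fraction(1)):
    """notes: (pitch, duration_in_beats) pairs.
    Keep the pitch of each note whose start time is a multiple of `step`."""
    kept = []
    start = Fraction(0)
    for pitch, duration in notes:
        if start % step == 0:
            kept.append(pitch)
        start += Fraction(duration)
    return kept

half = Fraction(1, 2)
# Assumed rhythms: "what you are" as D D C, "LMNOP" as four quick Ds then C.
what_you_are = [(62, 1), (62, 1), (60, 2)]
lmnop = [(62, half), (62, half), (62, half), (62, half), (60, 2)]

print(on_beat_pitches(what_you_are))  # [62, 62, 60]
print(on_beat_pitches(lmnop))         # [62, 62, 60]
```

With the off-beat notes dropped, both phrases yield the same pitch series, so their Parsons Codes match. Exact fractions avoid rounding trouble when durations are halves or thirds of a beat.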

Avoiding Infinite Depth

When the domain seems to go on forever, what do you do?

1. Work from “pull”, not “push” – pull domain ideas from the needs of the application, rather than starting from the domain and pushing toward the application. Just as you can’t please every customer, you can’t allow for every possible application.

For the Parsons Code, I need access to the pitches of a song, not its rhythm or bowing style. For the quantized Parsons Code, I do need the rhythm – but I can still ignore other aspects.

2. Pay attention to the pressures and tensions in what you design. When a new aspect emerges, don’t wait too long to make the code reflect it.

For my current application, I’ve got several changes I need soon:

Separate Note name from Pitch (done)
Make Tune recognize that it’s a series of musical signs, not all of which are notes (mostly done)
Handle more aspects of the input format (even if they’re just ignored)

3. Keep the domain in focus, using it to inform the code’s model, and making sure that input and reporting concerns don’t undermine it.

For example, I don’t yet know everything a Tune will need, but I know that the Parsons Code needs to know the series of pitches.
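As a sketch of what "pulling" looks like in code, a Tune could hold every kind of musical sign while exposing only the series of pitches. The class names `NoteSign` and `BarLine` and the integer pitches are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NoteSign:
    pitch: int       # assumed: a MIDI-style number
    length: int = 1

@dataclass(frozen=True)
class BarLine:
    pass

class Tune:
    """A series of musical signs, not all of which are notes."""

    def __init__(self, signs):
        self.signs = list(signs)

    def pitches(self):
        # The only view the Parsons Code needs: pitches in order, nothing else.
        return [sign.pitch for sign in self.signs if isinstance(sign, NoteSign)]

tune = Tune([NoteSign(60), NoteSign(60), BarLine(), NoteSign(67, 2)])
print(tune.pitches())  # [60, 60, 67]
```

New kinds of signs can be added later without changing what the analysis sees.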

4. Resist the siren song: “Hurry, hurry – build the next feature!” It pulls us toward shortcuts instead of consolidating what we’ve learned. That is true technical debt: we know what we should have done, but our code doesn’t yet reflect it.

For example, I had put off detailed error handling, but I recently made two changes: showing all errors (not just the first), and showing context for where the error was detected. This immediately paid off in clarifying a couple of errors I’d misunderstood.
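Those two changes can be sketched as a checker that collects every problem, each with its position and a slice of surrounding text, instead of stopping at the first one. The `ParseError` shape, the `check_abc` name, and the tiny allowed character set are assumptions; real abc accepts far more than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParseError:
    position: int
    message: str
    context: str  # a few characters on either side of the problem

def check_abc(text, allowed="ABCDEFGabcdefg0123456789| "):
    """Collect every unexpected character rather than failing on the first."""
    errors = []
    for i, ch in enumerate(text):
        if ch not in allowed:
            context = text[max(0, i - 5): i + 6]
            errors.append(ParseError(i, f"unexpected character {ch!r}", context))
    return errors

for err in check_abc("CC GG | AH G2 | FF ?E |"):
    print(f"col {err.position}: {err.message} near {err.context!r}")
```

This input reports two errors in one pass, each with enough context to locate it.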

Conclusion

Pull from the domain based on what your system needs
Let the domain inform your understanding
Develop incrementally, paying attention to where your design needs to grow