Estimable Stories in the INVEST Model

Estimable stories can be estimated: that is, we can make some judgment about their size, cost, or time to deliver. (We might wish for the term estimatable, but it’s not in my dictionary, and I’m not fond enough of estimating to coin it.)

To be estimable, stories have to be understood well enough, and be stable enough, that we can put useful bounds on our guesses.

A note of caution: Estimability is the most-abused aspect of INVEST (that is, the most energy spent for the least value). If I could re-pick, we’d have “E = External”; see the discussion under “V for Valuable”.

French translation: “Le modèle INVEST – E comme stories Estimables”

Why We Like Estimates

Why do we want estimates? Usually it’s for planning, to give us an idea of the cost and/or time to do some work. Estimates help us make decisions about cost versus value.

When my car gets worked on, I want to know if it’s going to cost me $15 or $10K, because I’ll act differently depending on the cost. I might use these guidelines:

< $50: just do it
$50-$300: get the work done but whine to my friends later
$300-$5000: get another opinion; explore options; defer if possible
$5000+: go car shopping
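As a small illustration, those guidelines can be written as a lookup. This is only a sketch: the function name is made up for this example, and the source ranges don't say which bucket an exact $50, $300, or $5000 falls into, so the choice here (the higher bucket) is an assumption.

```python
def action_for_repair(cost_dollars: float) -> str:
    """Map an estimated repair cost to the action from the guidelines above.

    Boundary handling is an assumption: an exact $50, $300, or $5000
    is treated as belonging to the higher bucket.
    """
    if cost_dollars < 50:
        return "just do it"
    if cost_dollars < 300:
        return "get the work done but whine to my friends later"
    if cost_dollars < 5000:
        return "get another opinion; explore options; defer if possible"
    return "go car shopping"


# Four quite different estimates, one from each bucket.
for estimate in (15, 200, 1200, 10_000):
    print(f"${estimate}: {action_for_repair(estimate)}")
```

Notice that the decision only needs the estimate to land in the right bucket; a $1200 guess and a $2500 guess lead to the same action.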

Life often demands some level of estimation, but don’t focus so much on cost that you ignore delivery or value.

We’ll go through facts and factors affecting estimates; at the end I’ll argue for as light an estimation approach as possible.

Face Reality: An Estimate is a Guess

If a story were already completed, the cost, time taken, etc. would be (could be?) known quantities.

We’d really like to know those values in advance, to help us in planning, staffing, etc.

Since we can’t know, we mix analysis and intuition to create a guess, which could be a single number, a range, or a probability distribution. (It doesn’t matter whether it’s points or days, Fibonacci or t-shirt sizes, etc.)

When we decide how accurate our estimates must be, we’re making an economic tradeoff, since it costs more to create estimates with tighter error bounds.

How Are Estimates Made?

There are several approaches, often used in combination:

Expert Opinion AKA Gut Feel AKA SWAG: Ask someone to make a judgment, taking into account everything they know and believe. Every estimation method boils down to this at some point.
Analogy: Estimate based on something with similar characteristics. (“Last time, a new report took 2 days; this one has similar complexity, so let’s say 2 days.”)
Decomposition AKA Divide and Conquer AKA Disaggregation: Break the item into smaller parts, and estimate the cost of each part — plus the oft-forgotten cost of re-combining the parts.
Formula: Apply a formula to some attributes of the problem, solution, or situation. (Examples: Function Points, COCOMO.)
Challenges:
- formulas’ parameters require tuning based on historical data (which may not exist)
- formulas require judgment about which formulas apply
- formulas tend to presume the problem or solution is well-enough understood to assess the concrete parts
Work Sample: Implement a subset of the system, and base estimates on that experience. Iterative and incremental approaches provide this ongoing opportunity.
Buffer AKA Fudge Factor: Multiply (and/or add to) an estimate to account for unknowns, excessive optimism, forgotten work, overheads, or intangible factors. For example: “Add 20%”, “Multiply by 3”, or “Add 2 extra months at the end”.
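To make the arithmetic concrete, here is a minimal sketch that combines two of the approaches above: decomposition (including the re-combination cost) and a multiplicative buffer. The function name, the part sizes, and the 20% buffer are all made up for illustration; they are not recommendations.

```python
def decomposed_estimate(part_days, recombine_days, buffer_multiplier=1.0):
    """Sum the estimates of the parts, add the cost of re-combining them,
    then apply a multiplicative buffer ("fudge factor")."""
    subtotal = sum(part_days) + recombine_days
    return subtotal * buffer_multiplier


# Hypothetical story split into three parts, each estimated in days.
parts = [2.0, 3.0, 1.5]

# One day to integrate the parts, then "add 20%" for unknowns.
total = decomposed_estimate(parts, recombine_days=1.0, buffer_multiplier=1.2)
print(f"estimate: {total:.1f} days")
```

Every input here is still somebody’s guess, so the formula only rearranges expert opinion; it doesn’t replace it.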

Why Is It Hard to Estimate?

Stories are difficult to estimate because of the unknowns. After all, the whole process is an attempt to derive a “known” (cost, time, …) from something unknowable (“exactly what will the future bring?”).

Software development has so many unknowns:

The Domain: When we don’t know the domain, it’s easier to have misunderstandings with our customer, and it can be harder to have deep insights into better solutions.
Level of Innovation: We may be operating in a domain where we need to do things we have never done before; perhaps nobody has.
The Details of a Story: We often want to estimate a story before it is fully understood; we may have to predict the effects of complicated business rules and constraints that aren’t yet articulated or even anticipated.
The Relationship to Other Stories: Some stories can be easier or harder depending on the other stories that will be implemented.
The Team: Even if we have the same people as the last project, and the team stays stable throughout the project, people change over time. It’s even harder with a new team.
Technology: We may know some of the technology we’ll use in a large project, but it’s rare to know it all up-front. Thus our estimates have to account for learning time.
The Approach to the Solution: We may not yet know how we intend to solve the problem.
The Relationship to Existing Code: We may not know whether we’ll be working in a habitable section of the existing solution.
The Rate of Change: We may need to estimate not just “What is the story now?” but also “What will it be by the end?”
Dysfunctional Games: In some environments, estimates are valued mostly as a tool for political power-plays; objective estimates may have little use. (There’s plenty to say about estimates vs. commitments, schedule chicken, and many other abuses, but I’ll save that for another time.)
Overhead: External factors affect estimates. If we multi-task or get pulled off to side projects, things will take longer.

Sitting in a planning meeting for a day or a week and ginning up a feeling of commitment won’t overcome these challenges.

Flaws In Estimating

We tend to speak as if estimates are concrete and passive: “Given this story, what is the estimate?”

But it’s not that simple:

“N for Negotiable” suggests that flexibility in stories is beneficial: flexible stories help us find bargains with the most value for their cost. But the more variation you allow, the harder it is to estimate.
“I for Independent” suggests that we create stories that can be independently estimated and implemented. While this is mostly true, it is a simplification of reality: sometimes the cost of a story depends on the order of implementation or on what else is implemented. It may be hard to capture that in estimates.
Factors that make it hard to estimate are not stable over time. So even if you’re able to take all those factors into account, you also have to account for their instability.

Is estimating hopeless? If you think estimation is a simple process that will yield an exact (and correct!) number, then you are on a futile quest. If you just need enough information from estimates to guide decisions, you can usually get that.

Some projects need detailed estimates, and are willing to spend what it takes to get them. In general, though, Tom DeMarco has it right: “Strict control is something that matters a lot on relatively useless projects and much less on useful projects.”

Where does that leave things? The best way is to use as light an estimation process as you can tolerate.

We’ll explore three approaches: counting stories, historical estimates, and rough order of magnitude estimates.

Simple Estimates: Count the Stories

More than ten years ago, Don Wells proposed a very simple approach: “Just count the stories.”

Here’s a thought experiment:

Take a bunch of numbers representing the true sizes of stories
Take a random sample
The average of the sample is an approximation of the average of the original set, so use that average as the estimate of the size of every story (“Call it a 1”)
The estimate for the total is the number of stories times the sample average
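The thought experiment is easy to try out in code. The sketch below is only an illustration: the "true" sizes are randomly generated (in real life we never see them), and the function name, sample size, and seed are arbitrary choices for this example.

```python
import random


def estimate_total_by_counting(true_sizes, sample_size, rng):
    """Estimate the total size as (number of stories) x (sample average)."""
    sample = rng.sample(true_sizes, sample_size)  # random sample, no cherry-picking
    average = sum(sample) / len(sample)           # "call it a 1"
    return average * len(true_sizes)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Made-up "true" sizes for 200 stories.
true_sizes = [rng.choice([1, 2, 3, 5, 8]) for _ in range(200)]

estimate = estimate_total_by_counting(true_sizes, sample_size=30, rng=rng)
actual = sum(true_sizes)
print(f"estimated total: {estimate:.0f}, actual total: {actual}")
```

Re-running with different seeds or sample sizes gives a feel for how far the estimate wanders from the actual total.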

What could make this not work?

If stories are highly inter-dependent, and the order they’re done in makes a dramatic difference to their size, the first step is void since there’s no such thing as the “true” size.
If you cherry-pick easy or hard stories rather than a random set, you will bias the estimate.
If your ability to make progress shifts over time, the estimates will diverge. (Agile teams try to reduce that risk with refactoring, testing, and simple design.)

I’ve seen several teams use a simple approach: they figure out a line between “small enough to understand and implement” and “too big”, then require that stories accepted for implementation be in the former range.

Historical Estimates (à la Kanban)

For many teams, the size of stories is not the driving factor in how long a story takes to deliver. Rather, work-in-progress (WIP) is the challenge: a new story has to wait in line behind a lot of existing work.

A good measure is total lead time (also known as cycle time or various other names): how long from order to delivery. Kanban approaches often use this measure, but other methods can too.

If we track history, we can measure the cycle times and look for patterns. If we see that the average story takes 10 days to deliver and that 95% of the stories take 22 or fewer days to deliver, we get a fairly good picture of the time to deliver the next story.
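Here is one way such a summary might be computed. It is a sketch under stated assumptions: the history is made-up data, chosen so the result reproduces the numbers in the example above, and the percentile uses the simple nearest-rank method (other definitions give slightly different values).

```python
import math
import statistics


def cycle_time_summary(cycle_days, percentile=95):
    """Return (mean, nearest-rank percentile) of past cycle times in days."""
    ordered = sorted(cycle_days)
    # Nearest rank: the smallest value with at least `percentile`% of
    # the stories at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return statistics.mean(ordered), ordered[rank - 1]


# Hypothetical cycle times (days from order to delivery) for 20 past stories.
history = [3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 12, 14, 17, 22, 27]

mean_days, p95_days = cycle_time_summary(history)
print(f"average: {mean_days} days; 95% delivered within {p95_days} days")
```

With this data the average is 10 days and 95% of the stories took 22 or fewer days.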

This moves the estimation question from “How big is this?” to “How soon can I get it?”

When WIP is high, it is the dominant factor in delivery performance; as WIP approaches 0, the size of the individual item becomes significant.

Rough Order of Magnitude

A rough order of magnitude estimate just tries to guess the time unit: hours, days, weeks, months, years.
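As a sketch of how coarse this is, the function below maps a guess in working hours to a time unit. The thresholds are assumptions made for this example (an 8-hour day, a 40-hour week, roughly 160 hours in a month and 2000 in a year); pick whatever boundaries suit your team.

```python
def rough_order_of_magnitude(working_hours: float) -> str:
    """Reduce a guess in working hours to a time unit.

    Thresholds are illustrative assumptions, not a standard.
    """
    if working_hours < 8:
        return "hours"
    if working_hours < 40:
        return "days"
    if working_hours < 160:
        return "weeks"
    if working_hours < 2000:
        return "months"
    return "years"


for guess in (3, 20, 100, 500, 5000):
    print(f"{guess} hours -> {rough_order_of_magnitude(guess)}")
```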

You might use such estimates like this:

Explore risk, value, and options
Make rough order of magnitude estimates
Focus first on what it takes to create a minimal but useful version of the most important stories
From there, decide how and how far to carry forward by negotiating to balance competing interests
Be open to learning along the way

Conclusion

Stories are estimable when we can make a good-enough prediction of time, cost, or other attributes we care about.

We looked at approaches to estimation and key factors that influence estimates.

Estimation does not have to be a heavyweight and painful process. Try the lighter ways to work with estimates: counting stories, historical estimates, and/or rough order of magnitude estimates.

Whatever approach you take, spend as little as you can to get good-enough estimates.

Related Material

“INVEST in Good Stories and SMART Tasks,” the original article describing the INVEST model.
“Independent Stories in the INVEST Model”
“Negotiable Stories in the INVEST Model”
“Valuable Stories in the INVEST Model”
“Small – Scalable – Stories in the INVEST Model”
“Testable Stories in the INVEST Model”
#noestimates – Exploration of alternatives to estimation.
Composing User Stories – eLearning from Industrial Logic

Postscript: My thinking on this has definitely evolved over the years, but I’ve always felt that Small and Testable stories are the most Estimable :)