Tag Archives: team

The Vision Thing: How Do You Charter? #agile2011

We held a "Fringe" session at Agile 2011 to discuss how people charter or kick off projects. 

Elements of "Kickoff"

[These are in no particular order.]

  • Vision
  • Release Criteria
  • Success Criteria
  • From and To State
    • Business capability
    • Solution vision
  • Risks / Fears
  • Rallying One-Liner [may match up to Vision]
  • [Early] Backlog (maybe)
  • Mission
  • Team Agreements / Social Contract
  • Community [incl. users, customers]
  • Project Boundary
  • Domain Language
  • Scope Discussion
    • Tradeoffs
    • What's the minimum?
    • "Big rocks"
    • "Not" List
  • Guiding Principles
  • Resources / Constraints

Factors / Approaches / Techniques

  • Innovation Games
  • Metaphor
  • Mini-design studios (e.g., to explore shared understanding of "commitment")
  • Sliders
  • Ranking
  • "Not" List /  In-Out List
  • Sr. and other management present
  • No iteration 0 [get going instead]
    • -or-
  • Iteration zero that includes a skinny end-to-end "Hello World"
  • Timeboxes
  • Quickstart approach
  • Experimentation
  • Inception Deck

Thanks to all who participated! 

Slicing Functionality: Alternate Paths

By Bill Wake, Joseph Leddy, and Kent Beck


When you need to break up a big feature, you often have many choices about how to do so.



One of the basic challenges of software project management is sequencing. You have to do something today and something else tomorrow. Another challenge is the need for accountability and the ability to report progress. You'd like to make progress in a way that everyone appreciates. One way to do this is to create small increments of business functionality.

On the other hand, a project can feel micro-managed if it has too many too-small pieces. If the pieces are laid out in advance, you reduce the team's ability to respond to discovery and learning. So, we have to slice the system in small pieces, but not too small.

Once you decide to track the development of the system by completing bits of business functionality, you have the problem of which way to slice the system. The same system can be sliced many different ways, depending on the needs and capabilities of the whole team. Sometimes you want to explore the innovative core of an application and you don't need to see the mundane input and output functionality early. Sometimes you want to put a minimal system into production quickly. Sometimes a group has fears about a part of the system that can be addressed by implementing it early. Sometimes the team just needs to get going, so a simply-implemented slice is appropriate. Sometimes you need a sizzling demo.

Exploring all the different ways to slice a system has been largely a matter of intuition. This paper presents a graphical technique for exploring different slicings. It builds on examples the first author drew during the first Programming Intensive Workshop at the Three Rivers Institute (www.threeriversinstitute.org). The format of the workshop is to spend four days implementing simple games as a way to reconnect to the joy of programming. Because we worked on several different games, the workshop provided ample opportunity for slicing systems in various ways.


Real systems are too big to describe in one story. So, you have to split them into a series of stories. But how do you make that split?

Two questions can help you decide:

  1. What slice represents the essence of the system?
  2. What slice will help me learn the most?

The essence of a system is what it does, reduced to its barest form. For example:

  • web sales system: a purchase transaction
  • word processor: enter text and see it on-screen
  • game: some interaction between player and system
  • workflow: a transaction moved from one station to another

The learning side is important because it points to the places where we are most at risk. It's true that you don't always know what you most need to learn, but learning by trying something can help keep us out of analysis paralysis.

Games as Design Test-beds

One exercise for thinking about software design is to consider an existing system, and how it might have been designed in the first place. You can imagine the decisions you might have made and the insights that might have arisen from them. Sometimes this gives you great appreciation for a design move.

It's also helpful to consider systems of moderate size. A real word processor may be way too big for an exercise. But games are often a good size to think about.

We'll show examples of two games as platforms for thinking about how to split stories: Tetris™ and a stacked letter puzzle. The former you probably know. The latter is a type of word puzzle: a quotation is written on multiple lines, then the letters in each column are scrambled. To solve the puzzle, put the right letters back in place in the bottom grid.

Here are sketches for the choices involved in each:


The left side of the diagram shows the key screen, with annotations around it. Around the edges are a number of choices that could make a super-simple version.

  • One-tris: has a single column. Each piece comes in the top, and all you can do is press a key to accelerate. This determines whether you die quickly or slowly. Then you could go on to add collapsing, different shapes, etc.
  • Two-tris: has two columns. You can move the piece side to side as it falls. The diagram says "collapse row" but that could be a second version.
  • Side-tris: has a few columns. This one is a bigger bite to start with. It doesn't have rotation but has more of the whole game.
  • Slide-tris: don't have columns, just set up the interaction of moving side to side. Then add height, piling up, etc.
  • "Extras": add previews, music, scoring, etc. They're all part of the final game but aren't crucial to it.
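To make the slicing concrete, here's what a first cut of "one-tris" might look like in Python. This is a sketch of my own, not code from the workshop: a single column, pieces that fall and stack, one key to accelerate, and a game that ends when the column fills.

```python
class OneTris:
    """A single-column Tetris slice: pieces fall, stack up, and fill the column."""

    def __init__(self, height=6):
        self.height = height      # cells in the single column
        self.stack = 0            # cells already filled from the bottom
        self.piece_row = 0        # current piece's row; 0 = top
        self.game_over = False

    def tick(self):
        """Advance one time step: the piece falls one cell, landing if blocked."""
        if self.game_over:
            return
        bottom = self.height - self.stack - 1   # lowest free row
        if self.piece_row >= bottom:
            self._land()
        else:
            self.piece_row += 1

    def accelerate(self):
        """The one user action: drop the piece straight onto the stack."""
        if not self.game_over:
            self._land()

    def _land(self):
        self.stack += 1
        self.piece_row = 0                      # next piece enters at the top
        if self.stack >= self.height:
            self.game_over = True
```

Collapsing a full column, multiple columns, and different shapes would each be a later overlay; the point is that even this tiny slice has both user action and computer response.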

While we're sure there are more ways to start, let's consider what these have in common. The first and most important thing is that they all have the essence of both user action and computer response.

We could imagine a different one-tris that just has the computer spitting out blocks and dropping them (with no user interaction). This split isn't as good: by omitting the interaction side, it leaves out a key part of the core of the problem. (That's why a simplified domain provides good practice; in a real domain, we might go much further down the path without realizing that we were unbalanced.)

So which is the best path? We haven't implemented Tetris (especially 4+ ways), so we have to speculate. But either a simplified version of "one-tris" (with blocks falling but not stacking), or "slide-tris" (with no blocks at first), seem like good simple starting points for introducing blocks, motion, and interaction.

Stacked Letter Puzzle

Here's a sample stacked-letter puzzle:


You create the puzzle by writing a quotation into a grid, then pulling the letters in any column to the top, and scrambling them.
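Generating such a puzzle is mechanical; here's a minimal sketch (my own construction, not the actual implementation): write the quotation into fixed-width rows, then shuffle each column.

```python
import random

def make_puzzle(quotation, width, rng=random):
    """Write the quotation into rows of `width`, then scramble each column."""
    padded_len = -(-len(quotation) // width) * width     # round up to full rows
    text = quotation.ljust(padded_len)
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    columns = [[row[c] for row in rows] for c in range(width)]
    for col in columns:
        rng.shuffle(col)          # scramble the letters within one column
    return ["".join(col[r] for col in columns) for r in range(len(rows))]
```

Solving is the inverse: the player must unscramble each column back into the quotation.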

There are several possible starting places:

  • a one-dimensional puzzle. (In the example shown, ONED is the answer, partially filled in.)
  • a tool to create puzzles
  • a grid with a fixed size
  • a variety of interaction possibilities (typing vs. dragging)

In this case, the implementation (Dewdrop™, by the first author) started with a one-dimensional puzzle. Looking back, it might have been better to have started with a one-letter puzzle, and then worked toward height before worrying about length. But it seems like either approach should work, and converge toward the same place.

Three Tricks

Following are a few approaches that might help you think of different ways to evolve a system.

Ontogeny Recapitulates Phylogeny

There was a theory in evolutionary biology (no longer believed) that said, "Ontogeny recapitulates phylogeny." (Memorable today only because of all the syllables.) It claimed that the stages of growth of an organism correspond to the species' evolutionary history. Thus, for example, a mammal starts out as a single cell, then multiple cells, then adds a skeleton, becomes fish-like, adds mammal features, then fully develops into itself.

It might not be much use as a biological theory, but it can be an inspiration for design choices. Think how your product and its competitors have evolved over time. If you're starting fresh, you can use that as a guideline to what's been most important to the market. So, for a word processor, you might consider basic editing first, then printing, then styles or perhaps spell checking, then drawing tools, grammar checking, kerning, etc.

This isn't a hard and fast rule; if spell checking were going to be the amazing feature that would set a product apart, it should be explored sooner. But this guideline does remind you to consider where other groups have previously found it easiest to get incremental value.

Transparent Overlays

Back when encyclopedias were books printed on paper, there was often an entry on the human body that let you look at the body through a series of transparent overlays. The basic page had a picture of a skeleton. You could flip a transparent page over the skeleton and see how the organs fit in. Then you could flip another transparent page, and see muscles over both. Finally, you could flip over one more page and see the skin.

You can think of building up a system the same way. A basic Tetris game might start with just a single column. The next version could add multiple columns, then a border with scores, then sounds, then another border with previews. At any point, the system makes sense, even though it's not as complete as the final system will be.

Unlike the encyclopedia, you don't have to put the overlays in a fixed order. You can explore which ordering of overlays will be most valuable, and you can change your mind later as you learn more.

Bounding Box

What if your system had to live under different constraints? How can you make your system be as valuable as possible in a constrained environment? Consider these possibilities:

  • Character-Based User Interface. What if you had no graphics, just a 24-line by 80-column display?
  • Voice-Based User Interface. What if your system ran without a screen? Could you make it valuable if used over the phone?
  • Cell-Phone Screen: What if you had to put your system on a cell-phone? You get two square inches for the user interface. What's important enough to keep?

Practice: Wiki

Want to give it a shot? Pick a game and try. Or if you'd like to practice with a more realistic system, try the wiki. (See http://www.c2.com/cgi/wiki if you're not familiar with wikis.)

In brief, a wiki is a website that lets people create, edit, and cross-link web pages. Following is a list of features. How would you arrange them to create something quickly that captures the essence of the system, maximizes value as you deliver the pieces, and lets you learn what you need to along the way? 

  • View a web page
  • Edit text on a page
  • Create a new page
  • Wiki markup (see http://www.c2.com/cgi/wiki?TextFormattingRules)
  • WikiWords that link to an existing page
  • WikiWords that link to a "create me" page
  • EditCopy – retaining a copy of the previous version of a page
  • Reverse links (click on a page's title to see pages that refer to it)
  • Find page, by searching in title or body
  • "Like pages", those with a common word at the beginning or end of the name
  • Sister sites: links to other wikis with a page of the same name
  • RecentChanges: links to pages that changed in the last few days
  • Last edited date
  • List of prior versions; history pages (read-only previous versions)
  • Images
  • Anti-spammer "tricks" (e.g., using the same results page URL for all searches)
  • Spell checking
  • Converting spaces to tabs (to support browsers that can't)
  • User names (so RecentChanges can show that)
  • Marking "new" pages in RecentChanges
  • Deleting pages
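As one illustration of a minimal, end-to-end first slice (a sketch under my own assumptions, not c2's implementation): the first three features plus WikiWord linking fit in a few lines, and everything else on the list can layer on top.

```python
import re

# Two-or-more "humps" of Capitalized words, e.g. FrontPage, TextFormattingRules
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

class MiniWiki:
    """View, edit, and create pages, plus WikiWord links: the essence of a wiki."""

    def __init__(self):
        self.pages = {}

    def edit(self, name, text):
        """Create or replace a page (creating and editing share one code path)."""
        self.pages[name] = text

    def render(self, name):
        """Return the page text with WikiWords turned into links."""
        def link(match):
            word = match.group(0)
            if word in self.pages:
                return f'<a href="/{word}">{word}</a>'
            return f'{word}<a href="/edit/{word}">?</a>'   # "create me" link
        return WIKI_WORD.sub(link, self.pages.get(name, ""))
```

From here, EditCopy, RecentChanges, search, and the rest become separate stories layered over the same page store.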

Here are things not (now?) in the c2 wiki that other wikis have added:

  • Tables
  • Unicode support
  • User logins
  • SubPages: a hierarchical wiki namespace
  • Free links: links not restricted to being a WikiWord
  • RSS feeds
  • Email notification of changed pages
  • Alternate markup, e.g., TeX
  • Polls
  • Active content (e.g., calculations)
  • File uploads
  • Minor edits (flagged so they don't show up in recent changes)
  • Piped links: the target of the link doesn't match the displayed name
  • WYSIWYG editing
  • Merge support (for when changes conflict)

If you try this exercise (with a game, wiki, or something else), we'd be happy to link to your results.


Real systems have complex clusters of functionality, but they benefit from starting as skinny as possible. It's a useful skill to be able to make this split. A graphical approach, identifying a key screen, and breaking it up into "overlays", can help you explore alternatives.

[First draft February, 2005. Drawings were made at Kent Beck's Programming Intensive Workshop, Feb., 2005. Revised and published, July, 2006.]

Agile Project Management, XP Style


How do you plan the overall shape of a project in XP? This article summarizes planning with little reference to the programming aspects of XP.


People

  • Team: the customers, programmers, and managers who are jointly working on the system.
  • Customer: the person or group defining the need for the system and what it does.
  • Programmer: the person or group that estimates and implements the system.

Time Periods

  • Exploration: part of the release cycle: when customers create stories, and programmers estimate them.
  • Iteration: part of the release cycle: a fixed-length time for implementation of stories. An iteration is time-boxed: if all stories can't be completed, stories are dropped rather than the iteration extended.
  • Release: delivery of system (usually to end users).

Artifacts

  • Story: a feature or capability of the system that a customer values.
  • Estimate: the cost (usually time) to implement a story. There have historically been two ways of doing estimates in XP: relative (a story is estimated as costing 1 to 3 "story points"), or absolute (so many hours for a pair).
  • Release Plan: a flexible, overall plan that tells which stories are expected in which iteration, for a whole release.
  • Iteration Plan: a small plan that tells which features are expected in the current iteration.

Capsule Summary

In XP, planning and implementation can take place at any time, but an XP project usually has different emphases at different times.

Exploration: The team learns about the problem and the technology options for a solution. The programmers estimate the cost of each story. Exploration typically lasts one to four weeks. Result: All important stories are understood (reasonably well) and estimated (in story points).

Release Planning: The programmers tell how many story points per iteration they expect to deliver. The customer plans which stories will be in the release, and in approximately what order they should be implemented. Result: A release plan.

Iterations: The team implements another group of stories, whichever ones the customer thinks are the most important at that time (and whose cost lets them fit into the iteration).  Iterations have a standard size, typically one to three weeks. Result: Working system that the customer can test.

Release: The team readies the system for release to a broader group. Releases are typically one to three months apart. Result: System delivered to end users.


From even the small description above, we can see several implications:

  • All the cycles are very short. The whole release cycle occurs in weeks to months.
  • The release plan is created approximately a quarter of the way through the release cycle.
  • The plan can be revised (by the customer!) at any time. At the start of each iteration, the customer decides which stories are most important at that time.
  • This flexibility places a burden on the programmers: they must be prepared to implement any feature next. XP uses specific programming techniques to make this possible in the software realm, and teams can use conditional estimates for special cases ("3 for the first one and 1 for each one after that").

The Estimates

The programmers work with the customers to make sure both parties agree on the meaning of a story. Then, the programmers give an estimate representing how much effort they believe is involved to deliver that feature.

Some teams use relative estimates. A common style is 1 to 3 points; for anything bigger than 3, the customer is asked to split the story into smaller stories. The programmers typically think of a 1 as being, "If this were all I was doing, I could do it in 1 week."

Why use relative estimates instead of absolute estimates (e.g., 12 days)? Because people tend to be better at judging relative sizes ("this is about the same work as that"), and they tend to be better at thinking in terms of working time than at estimating their overhead.

Other teams use absolute estimates. (This is the approach Kent Beck is currently recommending.) This is typically done in pair-hours (the cost for two people working together for one hour). 

What if we later find out an estimate is wrong? The team can update their estimates at any time as they learn more. (And yes – if things suddenly get expensive or cheap, it can have a big impact on the release, and the customer may re-assess the overall plan.)

The Release Plan

Once every story has an estimate, the programmers declare how much capacity they think they will average per iteration ("the velocity"). (A reasonable, conservative guess is "1/4 pair-hour per programmer per week.")

The customer creates the release plan as a list of features in priority order, grouped into iterations according to the estimate and the velocity. The customer can split any story that doesn't quite fit (and have the programmers estimate the new stories).

For example, assuming a declared velocity of 40 hours per iteration and a team planning to do 3 iterations:

  • Iteration 1: Text screen – 10 hours; Simple query – 30 hours
  • Iteration 2: Boolean query – 20 hours; Simple screen – 20 hours
  • Iteration 3: Printing – 30 hours; Distribution – 10 hours
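The grouping step itself is mechanical once the stories are in priority order. Here's a sketch (the function name and the greedy spill-over strategy are my own; a real customer would split a story that doesn't fit rather than let it slide):

```python
def plan_release(stories, velocity):
    """Greedily group prioritized (name, estimate) pairs into iterations.

    A story that doesn't fit the current iteration starts the next one;
    in practice the customer might instead split it and re-estimate.
    """
    iterations, current, used = [], [], 0
    for name, estimate in stories:
        if used + estimate > velocity and current:
            iterations.append(current)      # close out the full iteration
            current, used = [], 0
        current.append(name)
        used += estimate
    if current:
        iterations.append(current)
    return iterations

stories = [("Text screen", 10), ("Simple query", 30),
           ("Boolean query", 20), ("Simple screen", 20),
           ("Printing", 30), ("Distribution", 10)]

# plan_release(stories, 40) reproduces the example plan:
# [['Text screen', 'Simple query'],
#  ['Boolean query', 'Simple screen'],
#  ['Printing', 'Distribution']]
```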

The customer can decide whether the goal for the release is a certain list of features or a certain release date. (Iterations have a fixed duration, but the overall release may or may not be time-boxed.) This plan can (and often should) change after each iteration, as the team learns more.

The First Iteration

XP suggests a special goal for the first iteration: it needs to provide a minimal, end-to-end solution involving every major component. (Many components will be present in only skeletal form.)

In the example above, the team might split "Simple query" and "Printing," and find a way to fit super-simple versions of each into the first iteration.

The first iteration provides a skeleton for the rest of the implementation, and it helps address high risks in the project. The team is free to completely change the structure around later, but the first iteration provides a starting point.

Other Iterations

Use however many story points the team finished in the last iteration as the estimate for how many points will be done in the next iteration. (This is called the "Yesterday's Weather" rule: the best predictor for today's weather is yesterday's weather.)

For example:

  • Iteration 1: expected velocity 40, actual 20, accumulated 20, projected in 3 iterations: 60
  • Iteration 2: expected velocity 20, actual 40, accumulated 60, projected in 3 iterations: 100
  • Iteration 3: expected velocity 40, actual 40, accumulated 100, projected in 3 iterations: 100

The velocity may bounce around some at first, but it usually stabilizes.

Notice that in the release plan we originally expected to get 120 points in 3 iterations, but we only completed 100 points worth. And at the end of the first iteration, it looked even bleaker. This is a place where the customer's involvement is crucial: the customer can split or simplify stories, move the end date, ask for more people, cancel the project, or do whatever they need to do given the status of the project.
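The arithmetic behind the table can be sketched as follows (a hypothetical helper of my own, not part of XP itself): after each iteration, project the end of the release by assuming the next iterations match the last actual velocity.

```python
def project(actuals, total_iterations):
    """Yesterday's Weather: next iteration's plan = last iteration's actual.

    Returns (accumulated, projected_at_end) after each completed iteration.
    """
    results, accumulated = [], 0
    for i, actual in enumerate(actuals, start=1):
        accumulated += actual
        remaining = total_iterations - i
        results.append((accumulated, accumulated + actual * remaining))
    return results

# Using the actual velocities from the table (20, 40, 40 over 3 iterations):
# project([20, 40, 40], 3) → [(20, 60), (60, 100), (100, 100)]
```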

How does a team increase its velocity? If a team feels they're on track to finish early, they'll ask for more work in the middle of the iteration.

Iteration Planning

Going a level deeper, XP uses a similar process for deciding what to do in an iteration.

  1. The customer selects stories for the iteration. (This is subject to "Yesterday's Weather" – scheduling only as many hours as were completed in the previous iteration.)
  2. The team discusses each selected story, and brainstorms tasks that would implement them.
  3. Team members accept and estimate tasks.
  4. The team adjusts and balances the tasks to make sure they fit into the iteration.

The tasks for an iteration are in the development team's purview. It's typical for the customer to be present during this planning, as there will often be questions. These are the two most common approaches to task estimation:

  • No estimates on tasks: the team eyeballs the tasks to make sure none look too large (e.g., not bigger than half a day's work)
  • Hour-based absolute estimates: the number of pair-hours a task is expected to take. Some teams try to estimate individual programmers' velocities; more common is to use an estimate for the team as a whole.

The iteration planning is usually done the first morning of an iteration.

The iteration plan will often change during an iteration, as the team learns about tasks it missed or comes up with a better or different way to do things.

What's Interesting About XP's Planning Processes?

  • Extremely short cycles.
  • Use of conversations and tests about stories rather than documents.
  • Use of estimates, velocity, and fixed-length iterations.
  • "Yesterday's Weather" rule adjusting estimates every couple weeks.
  • First iteration as a backbone for the implementation and a driver of risk identification.
  • Commitment to flexibility, e.g., minimizing dependencies so stories can be implemented in arbitrary order.


[Written Nov. 5, 2001. Updated Feb. 27, 2002 due to feedback from Kurt Keutzer and Ivan Tomek. Minor revisions May 27, 2006.]

Review – Agile Estimating and Planning

Agile Estimating and Planning, Mike Cohn. Pearson Education, 2006.
My back-cover review was “Mike Cohn explains his approach to Agile planning, and shows how ‘critical chain’ thinking can be used to effectively buffer both schedule and features. As with User Stories Applied, this book is easy to read and grounded in real-world experience.” Let me add that he also discusses estimation, prioritization, some financial analysis, and monitoring. (Reviewed Jan., 2006)

Twenty Ways to Split Stories

The ability to split stories is an important skill for customers and developers on XP teams. This note suggests a number of dimensions along which you might divide your stories. (Added July, 2009: summary sheet (PDF), French translation (PDF).)

Splitting stories lets us separate the parts that are of high value from those of low value, so we can spend our time on the valuable parts of a feature. (Occasionally, we have to go the other way as well, combining stories so they become big enough to be interesting.) There's usually a lot of value in getting a minimal, end-to-end solution present, then filling in the rest of the solution. These "splits" are intended to help you do that.

The Big Picture

Easier → Harder

  • Research → Action: It's easier to research how to do something than to do it (where the latter has to include whatever research is needed to get the job done). So, if a story is too hard, one split is to spend some time researching solutions to it.
  • Spike → Implementation: Developers may not have a good feeling for how to do something, or for the key dimensions on which you might split a story. You can buy learning for the price of a spike (a focused, hands-on experiment on some aspect of the system). A spike might last an hour, or a day, rarely longer.
  • Manual → Automated: If there's a manual process in place, it's easier to just use that. (It may not be better, but it's less automation work.) For example, a sales system required a credit check. The initial implementation funneled such requests to a group that did the work manually. This let the system be released earlier; the automated credit check system was developed later. And it was not really throw-away work either – there was always going to be a manual process for borderline scores.
  • Buy → Build: Sometimes, what you want already exists, and you can just buy it. For example, you might find a custom widget that costs a few hundred dollars. It might cost you many times that to develop yourself.
  • Build → Buy: Other times, the "off-the-shelf" solution is a poor match for your reality, and the time you spent customizing it might have been better spent developing your own solution.

User Experience

Easier → Harder

  • Batch → Online: A batch system doesn't have to interact directly with the user.
  • Single-User → Multi-User: You don't face issues of "what happens when two users try to do the same thing at the same time." You also may not have to worry about user accounts and keeping track of the users.
  • API only → User Interface: It's easier not to have a user interface at all. For example, if you're testing your ability to connect to another system, the first cut might settle for a unit test calling the connection objects.
  • Character UI or Script UI → GUI: A simple interface can suffice to prove out critical areas.
  • Generic UI → Custom UI: At one level, you can use basic widgets before you get fancy with their styles. To go even further, something like Naked Objects infers a default user interface from a set of objects.
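To make the "generic UI" idea concrete, here's a sketch (my own, only loosely in the spirit of Naked Objects) that derives a default text form from an object's fields by introspection; the class name and fields are invented for illustration.

```python
from dataclasses import dataclass, fields

@dataclass
class Customer:               # any dataclass gets a form "for free"
    name: str = ""
    credit_limit: int = 0

def generic_form(obj):
    """Render a default label/value form from the object's own fields."""
    return "\n".join(
        f"{f.name.replace('_', ' ').title()}: [{getattr(obj, f.name)}]"
        for f in fields(obj)
    )

# generic_form(Customer("Ada", 500)) →
# Name: [Ada]
# Credit Limit: [500]
```

A custom UI would then be a later story that replaces this default only where the generic rendering isn't good enough.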


Easier → Harder

  • Static → Dynamic: It's easier to calculate something once than to ensure it has the correct value every time its antecedents change. Sometimes you can use a halfway approach: periodically check whether an update is needed, but don't apply it until the user requests it.
  • Ignore errors → Handle errors: While it's less work to ignore errors, that doesn't mean you should swallow exceptions. Rather, the recovery code can be minimized.
  • Transient → Persistent: Lets you get the objects right without worrying about changing the mapping of persisted data.
  • Low fidelity → High fidelity: You can break some features down by quality of result. E.g., "a digital camera could start as a 1-pixel black-and-white camera, then improve along several axes: 9 pixels, 256 pixels, 10,000 pixels; 3-bit color, 12-bit color, 24-bit color; 75% color accuracy, 90% color accuracy, 95% color accuracy." (William Pietri)
  • Unreliable → Reliable: "Perfect uptime is very expensive. Approach it incrementally, measuring as you go." (William Pietri)
  • Small scale → Large scale: "A system that works for a few people for moderate data sets is a given. After that, each step is a new story. Don't forget the load tests!" (William Pietri)
  • Fewer "ilities" (e.g., slower) → More "ilities": It's easier to defer non-functional requirements. (A common strategy is to set up spikes as side projects to prove out architectural strategies.)
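The halfway approach in the static/dynamic row is essentially a dirty flag: note cheaply that an antecedent changed, but recompute only when someone asks. A sketch (the names are my own):

```python
class Derived:
    """Recompute a derived value lazily: mark dirty on change, compute on read."""

    def __init__(self, compute):
        self._compute = compute
        self._dirty = True
        self._value = None

    def invalidate(self):
        """Cheap: call this whenever an antecedent changes."""
        self._dirty = True

    @property
    def value(self):
        """The expensive work happens only when the value is requested."""
        if self._dirty:
            self._value = self._compute()
            self._dirty = False
        return self._value
```

For example, a report total could be invalidated on every line-item edit but recomputed only when the report is viewed.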


Easier → Harder

  • Few features → Many features: Fewer is easier.
  • Main flow → Alternate flows: (Use case terminology.) The main flow – the basic happy path – is usually the one with the most value. (If you can't complete the most trivial transaction, who cares that you have great recovery if step 3 goes bad?)
  • 0 → 1: Hardware architects have a "0, 1, infinity" rule – these are the easiest three values to handle. Special cases bring in issues of resource management.
  • 1 → Many: It's usually easiest to get one right and then move to a collection.
  • Split condition → Full condition: Treat "and," "or," "then," and other connector words as opportunities to split. Simplify a condition, or do only one part of a multi-step sequence.
  • One level → All levels: One level is the base case for a multi-level problem.
  • Base case → General case: In general, you have to do a base case first (to have any assurance that recursive solutions will terminate).


These "splits" may help give you ideas when you're looking for a way to move forward in small steps. While it's important to be able to split stories, don't forget that you have to reassemble them to get the full functionality. But you'll usually find that there is a narrow but high-value path through your system.

[Developed for XP Day speech, Sept., 2005. January 6, 2006: Thanks to William Pietri for sharing his suggestions on the fidelity, reliability, and scale dimensions. Fixed typo, 7-19-06. Added "connectors", 1-8-11.]

Agile ’05 Conference Report

Part 1

The Agile '05 conference was July 24-29, 2005, in Denver, Colorado, USA. There were ten or twelve tracks running at all times, so this report necessarily covers only a small slice. Usually I teach, but this time I was an organizer, so I got to be more like an attendee in some ways.

Brian Marick and Bob Martin Keynote: Where were we, where are we, where are we going?

Brian Marick: We have different disciplines. We become multi-specialists. "A team of multi-specialists can knock the socks off a team of specialists." But there's no career path for multi-specialists – what conference would you go to?

Brian invited Jim Highsmith up to announce the formation of the Agile Project Leadership Network (APLN). Its focus is on applying agile principles to all projects, not just software. Relative to the Agile Alliance, this organization has consistent principles and intends to collaborate; it just has a different focus.

Bob Martin: "Value the flow of value."

The software industry: was either no process or waterfall. Now we're recognizing the failure of waterfall and the need to measure in-process work. It's time to do "software at home." Where we're going is to resolve this challenge, with short cycles, feedback, testing, craftsmanship.

The agile community: The Scrum paper in '95, XP in '99: resonated with the community. But we were fractured into many agile methods; branding. Now, industry is learning to pull and mix practices; we're at the moment of convergence. Future: head away from brands, strong focus on project management. At the end, we'll have a definition of agile.

Users: Were small teams with a programming focus; limited acceptance tests, project managers "lost." We are a community that has grown. We see 80-person teams, often mixing Scrum and XP. 300-person teams exist. A company of thousands doing transitions. Future: we'll grow. Agile project management will happen. Automated acceptance tests will be regarded as the measure of progress.

Brian Marick: There was a group of English cyberneticians. They made discoveries of performance, adaptation, surprise. But oops… no (single) institutional home, no acknowledged base body of techniques, no grooming of successors. Their tradition is gone. To avoid this, "we gotta take over a university department." He announced the "Gordon Pask award", to honor people whose recent contributions demonstrate their potential to be leaders of the field.

Open Space

There were a lot of topics; see the web site. One was "XP Rituals", where we identified a number of fun things:

  • Breath mints
  • Haiku written on card before starting a task
  • Hacky sack at standup
  • Traffic light for builds
  • Darts when a pair finishes a task
  • "Story starting" gong
  • Story completion frog (noise-maker) or bell
  • Daily progress chart – two thermometers, showing days used and points locked in
  • Automatic break program
  • Smiley stickers for story cards

Metaphors, by Dave West

In 2001, metaphor was listed as an official XP practice (part of architecture); in 2005, it was not. Kent has said it's redundant as a practice, since you can't help using it. (Dave argues it's a learned skill.)

Metaphor has a lifecycle from poetry through use. Lakoff and Johnson say all thought is based on metaphor. Dave argues:

  • If you don't choose consciously, you get one unconsciously.
  • Metaphor selection has a real effect on software design (architecture).
  • Metaphor is a skill you can develop.
  • Therefore, it should be an agile practice.

We have many default metaphors: machine, org chart, entity, dualism, computer science, software engineering, product manufacturing. There are alternatives: e.g., object as person.

To use metaphors well, learn: reflect, experiment, read widely, liberal arts, point of view.

Tim Lister: XP Denver Invited Talk

  • Managers are lonely – they don't have a collegial environment.
  • QA is a bottleneck, and no wonder: it's done late and in serial rather than in parallel.
  • Getting agreement on requirements is not neat. "Gathering requirements" sounds like they're Easter eggs in the yard. But really, most requirements have to be invented.
  • Problem: people believe the customer.
  • It's messy, so model and prototype. Don't prototype solutions – prototype wrong things on purpose (so they can laugh at it).
  • Estimating is overwhelmed by expectations: skew (never early). Separate goals from estimates.
  • Software in general is uninteresting. "One right way" is anti-intellectual.
  • "Never lose the satisfying feeling of making something valuable, together with your colleagues."


Part 2

Delivering APIs in an Agile Context, John Major

John Major described a project that was doing custom programming of lab workflows as part of the Human Genome Project. This was an environment with lots of churn: biology, instruments, J2EE.

Tensions:
  • Stable APIs vs. agile change
  • General features vs. "Do the simplest thing that could possibly work"
  • Frameworks vs. global refactoring
  • Framework design vs. features

The result was a platform API, a mix of data-driven (configuration) specification and custom code. The team had the attitude toward process, "We'll try anything for a month." There was lots of testing. They used Jemmy with Swing.

Unique practices:

  • Internal support: old and new features. Used the idea of "office hours" for support. Provided training. Tried having platform developers work on applications, but it didn't work so well.
  • Lightweight code reviews. Rather than pairing, they required at least a 20-minute weekly code review with an office-mate.
  • Agile database schemas. Problem: RDBMS is rigid, but schemas evolve. Solution: build agile schema features and tools to manage change.

Lessons learned: Good:

  • Agile practices help keep technical debt low.
  • Build tools to support RDBMS and large codebase.
  • Pull in senior people to help design and adoption.

Bad:
  • Cost of absorbing API changes is paid by the application teams, but benefits accrue to the business.
  • It's hard to get feature design right (to balance flexibility and focus).
  • The business had trouble understanding the costs of release management. (Branches made the whole thing even crazier; he described a "London subway" map they created to show all the paths.)

Ugly:
  • People issues – don't go into denial.
  • Weren't able to tell QA what the test suites did, so there was overlap between the automated tests and manual testing.
  • Be humble – the platform needs the app and vice versa.

Summary:
  • Build reusable platform technology
  • Use agile practices to cope with change
  • Work with "eyes wide open"

Part 3

Rachel Davies' and Mike Hill Workshop on Informative Workspaces

Informative workspaces:

  • Team memory
  • Visible status (keep it fresh)
  • Automation – light and sound
  • Hawthorne effect
  • Track "puzzles"
  • Root cause analysis
  • Positive focus

Themes: Ownership: Own the space; collective ownership of communication, accountability.

Transparency: Be honest with yourself – show where you really are; peer pressure; reveal hidden/unknown problems.

Keep it Fresh: Drop stale charts. Let anybody update. Automation (e.g., build server status, test status). May use projects or status monitor.

Intra-/Inter-Team Communication: Visual progress => motivation. Consider which teams see it.

Hokey-ness: Lack of ownership. Formality. Ownership – lead team to the charts. Time limit for new things. Cool is fun, but must be cool to team.

Kent Beck Open Space on Renewing the Fire

"The edge of knowing and not knowing" – Troy Frever.

What helps keep the fire? A lot of discussion on being in the zone, in flow – but that's only part of it. Many people crave novelty, mentoring, flow, living on the edge.

There's a "game face" you put on.

Pollyanna Pixton Open Space on Organizational Change

There's a "practice" level, for developers, managers, product owners. But there's also an organizational change level.

"Continuous retrospective" – collect cards all week.

Leadership "versus" self organization.

Kent Beck Open Space on XP for Beginning Teams

He uses an Appreciative Inquiry approach. Take a practice – what does it mean for you? Acts as a "practice inkblot".

An AI formulation:

  1. Remember a good (peak) situation
  2. Explore the circumstances that made it possible
  3. What is the meaning now?

We broke into pairs and did some mind-mapping of a practice.

Research Paper: A Case Study on the Impact of Scrum on Overtime and Customer Satisfaction – Mann and Maurer

Described a team that originally had variable sprints, then shifted its sprint size from 20 to 11 days at the customers' request. Research tried to address, "Does Scrum provide a sustainable pace?" They found there was less overtime (both mean and variance) after Scrum was introduced in this organization. Customers were more satisfied. (Planning closer to delivery led to less misdirected development; playing with the product in the release cycle let them tweak its direction.) Scrum gave them better control and visibility.

Research Paper: An Environment for Collaborative Iteration Planning (Liu, Erdogmus, Maurer)

They used a traditional-looking table with a projected image, and wireless tablet PCs. The thought was that people would create and edit stories on a tablet, then organize and prioritize stories using fingers and pens. This would give real-time information, as well as persistence.

The display was nice. One neat trick was that you could "push" a card toward somebody and it would glide at a realistic-looking rate – but never fall onto the floor. :)

Existing tools are either cards, which are easy to work with but lack persistence, or planning software, which creates an unbalanced environment (somebody controls the keyboard) but which does have persistence.

The research effort is just beginning; they want to evaluate, "Is it useful? Is it usable? How does it work compared to existing tools?"

Part 4

Jeff Sutherland on Advanced Scrum

"Better, faster, cooler." If this is interesting, see Jeff's paper. I made very quick notes but it's an interesting extension of the Scrum work and I plan to give it more study.

Scrum is out of development and into the whole company. This approach is for experienced ScrumMasters/developers only. See Lawrence Leach: "Eight Secrets to Supercharge Project Performance", Adv. Project, Inc.

Productivity is measured as "features/$100K"

How can we (agile?) win?

One study shows outsourcing cuts costs by only 20% – significant, but not quite the same as cutting 80% or more of costs as some have promised.

The new approach: anticipatory, requires accurate analysis of process and automatic monitoring.

Type A, B, and C Scrum – A: isolated cycles (breaks between sprints), B: overlapping iterations (no handoffs), C: all at once. Consider the sprint level. Need to do pre-staging work for the next iteration inside this one (or else we'll be tempted to fall back to type A).

Advanced scrum has multiple simultaneous concurrent sprints. All sprints release live software.

Practices:
  • MetaScrum for release planning
  • Variable-length sprints
  • Overlapping sprints for one team
  • Pre-stage product backlog
  • Daily scrum of scrums
  • Integrate product backlog and sprint backlog
  • Paperless project management and realtime reporting
  • Administrative overhead of 60 seconds/day/developer and 10 minutes/day/ScrumMaster.

One of the big challenges is having items in the backlog be ready. In their case, that means "intuitive for a doctor" – usable within one hour. This means stories need enough detail, and must have been prototyped with doctors. It forces the product owners to work with the developers to prototype things.

MetaScrum: weekly meeting of stakeholders. Led by the Product Owner. Use the three questions. All product decisions are made here. Release status. All decisions are communicated that day. Damage control plans are executed in the same day.

Iterations: three-month major product releases (eliminate bugs from portfolio); one-month sprints for new customers/upgrades (eliminate high-priority bugs); one-week sprints for critical issues. This gives them 45 production releases a year. This assumes good practices: they're using one code base, working at the tip.

Pre-staging: Overlap sprints, with no break between them. Work only on product backlog that is ready to go in the sprint. Pre-staging can double throughput in sprints.

Prioritizing sprints. Constraint theory tells us we must have slack. This lets you optimize multiple overlapping sprints. The backlog is fully automated, with priority levels for the different-length sprints (week, month, quarter). Developers focus on one task at a time, in priority order. Completing weekly/monthly tasks lets them get to the "fun" quarterly tasks.

Sprint backlog: Set to embrace change. Every sprint releases. Customer satisfaction is the top priority. Can restart – a MetaScrum decision.

Automated backlog: Via a bug tracking tool. For each task, the developer is asked: 1. For tasks touched today, how much time is invested in the task? 2. What percent is done? This provides enough data for management "micro-accounting".

Part 5

Random notes from Open Space

Overheard: "Whenever I've been late, I've been yelled at by management. One time, we actually pulled it together and finished early. The president called me – not to praise me, but to say, 'You sandbagged me.'"

Is XP Sustainable? Some things people do:

  • Daily: variety
  • Medium-term: cleanup projects
  • Gold cards (explicitly scheduled 'free time')
  • Slack

Rick Mugridge, Domain-Driven Design Patterns and Fit. Examples: queries vs. operations, entities vs. value objects, aggregates (key root objects), repository (turns into query or iterator). Can use a different test harness with the same fixture and same code, but control whether to connect to a "real" database via configuration.

A test writing pattern: create a setup, do some action, repeat the setup but show what has changed.

Norm Kerth: The Secrets of Leading from a Position of No Power

He showed clips of the film Gandhi, then debriefed to explore how Gandhi led. He points out that we can learn about leadership from studying history. Gandhi: "Without a journal of some kind, you cannot unite a community."

Some principles: transparency, persistence, courage. An unjust law: don't resist, but don't comply either.

How did he do it? "He decided something needed to change – something important, and worthy of his effort." He let go of assumed authority.

Inauthentic power is anointed power ("I'm powerful because I'm a manager") – a weak form of power, since it goes away if the position does. Authentic power comes from inside yourself.

Gandhi accepted the cost of punishment. "Better to lose your job for doing something, or for doing nothing?" He was respectful, sought commonality. He knew his support system: the law. He started small, finding people of like mind.

Characteristics of change agents:

  1. Ability to articulate a vision of where you are going (though it can change)
  2. Persistence
  3. Confidence – inner energy for the right thing
  4. Optimism – energy that you lend to someone else so they can gain confidence.

You can cultivate these!

"The most effective coach is sitting next to you, involved from the beginning to the end."

Gandhi: it paid to advertise. Peaceful revolutions are the lasting ones – they're really an evolution. Pick your battles – you don't need to do everything yourself. Know your real mission.

Joshua Kerievsky: Commoditizing Agility

According to Josh, we're right at the edge of Moore's chasm, and need to make it easier to make that move.

He had statistics from a project showing productivity increases by a factor of 2.5, achieved in one year with an XP team.

"The agile community is thriving, but transitions are too slow and expensive." Why? We're not agile enough in transitions. Problem: serialized knowledge transfer. The shift is repetitive and exhausting.

A couple of models: 16 days of coaching for a 15-person community. Or, an in-house workshop with 3-6 months of full-time coaching for a 20-30 person community. Then try to stretch to other communities, transferring experts out and future experts in. The problem: people are trained technically but not so much in coaching.

The basics are taught manually. This gives inconsistent content and it doesn't scale. Books don't go far enough. Basic knowledge is fragmented. Books require dedicated study.

This marginalizes coaching time: the basics take away from the hard stuff. It burns out the coaches. So we need to commoditize the basics. Experts are in short supply. How many coaches have 3+ years experience? Internal experts are "too indispensable" to share.

What is a commodity?
"Products and services are so standardized that attributes are roughly the same." A commodity market has declining prices and profit margins, more competition, and lower barriers to entry.

Ian Murdock: "Open source and the commoditization of software": Commoditization happens eventually. It's unstoppable, but it's good for all. There's a cycle: Innovation leads to standardization [accessible] leads to commoditization (maybe) [affordable] leads to innovation. Innovation builds on a custom platform.

Commoditization can go two ways: Decommoditization is an incompatible innovation, or customization. But be careful with decommoditization – it can leave you out of the innovation loop.

What can be commoditized: basics, stories, planning, chartering, retrospectives, … What can't: advanced practices, specialized things, "living agile values", people issues.

Josh showed an example of Husqvarna's sales training – a sort of animated comic book, with a model of the sales process built in. He showed a demo he made, using a screen capture tool.

Bottom line: Tomorrow we can have parallel processing for the basics, leaving more quality time for advanced topics. We can get quality, speedy, consistent knowledge transfer.

Workshop: TDD and Beyond, Astels and Marick

Brian Marick emphasized customer-facing tests that help with project communication. We don't want to do all the tests at once. "Doing Fit tests in a blob creates a bottleneck at the customer/QA level. We don't need all tests, we just need one to get started." He had the image of "programmers as baby birds" [feed me].

Rick Mugridge: We need to move toward declarative tests. "Procedural tests are a smell."

Questions: "Do you have to have normal (JUnit) TDD in place before STDD?" "Do you need a strong customer to make it work?"

Roadblocks: Fit is a challenge because you need programmers and testers to make it work.

Fit Unification Summit

Friday (after the conference), a number of Fit implementors met for a Fit unification summit. If you're interested in that, look for the fit-devl mailing list.

[Published in 5 parts, Aug. 23-27, 2005]




Review – Fit for Developing Software

Fit for Developing Software, by Rick Mugridge and Ward Cunningham.

[My bias disclosure – I know both Rick & Ward, I was a reviewer, and I’ve written for their publisher myself. This review is substantially as posted on the agile-testing group.]

Fit (see http://fit.c2.com) is a testing framework that Ward Cunningham developed. A test author writes tests as tables in a document that can be converted to HTML (e.g., from Word, Excel, or a text editor). The programmers develop fixtures that connect to the system under test. Fit mediates between the tests and the fixtures to run the tests, and captures the results. It colors the document red/yellow/green to show what happened.

The book is in two halves. The first half is targeted at test authors: from the user perspective, how does Fit work? It covers the basics of tables, fixtures, error handling, and so on. Then it goes into an extended example (several chapters) following a team developing rental software. The first half closes with advice about designing better tests.

The second half is targeted at programmers. While programmers should really read and understand the first half of the book, test authors will probably at most skim this half. This half starts by explaining how to implement various types of fixtures. Then it continues the earlier rental software example by showing the fixture code that would be developed. Finally, this half closes with some advanced topics: Fit’s architecture, custom fixtures and runners, and model-based test generation.

The authors have done a good job explaining Fit from both the test-writing and programming perspectives. The text is clearly written, using plenty of examples, frequent breaks for questions and answers along the way, and exercises at the ends of many chapters.

This book is unique. While you can find information about Fit and fixtures on the web, what’s on the web is much less readable than what this book provides. The book also gives you an extended example and helpful advice from two experts.

If you are considering Fit, or just want to understand its philosophy, this book provides the clearest explanation I’ve seen. For test authors, the first part of the book justifies the whole price. For programmers who need to understand how and why fixtures work, it’s even more of a bargain.

Fit for Developing Software, by Rick Mugridge and Ward Cunningham, with a foreword by Brian Marick. Prentice Hall, 2005. ISBN 0-321-26934-9.

Procedural and Declarative Tests

Procedural tests focus on a series of steps; declarative tests work from the input and states. Procedural tests are natural to write, but consider whether a declarative test expresses your intent more concisely and clearly.

Test Styles

There are two common styles of writing tests. One style might be called procedural: a test consists of a series of steps, each acting on or testing the state of the system. The second style might be called declarative: given a state of the system and some inputs, test the resulting state and the outputs. (These terms are from the theory of programming languages.) If we were using Ward Cunningham's fit testing framework, we might think of these as prototypically ActionFixtures vs. ColumnFixtures.

Let's look at a small, concrete example: a simple login screen.

We'd like to test that the OK button is highlighted only if both a username and a password have been specified.

Here's a test in the procedural style:

1 enter username bob
2 expect OK inactive
3 enter password pw
4 expect OK active
5 clear username  
6 expect OK inactive
7 clear password  
8 expect OK inactive

Here's a test of the same capability in a declarative style:

  Username Password OK active?
1 none none no
2 bob none no
3 none pw no
4 bob pw yes

There's a sense in which the first test is more natural to write: it tells you what to do, step by step. But consider some advantages of the second style:

  • If we want to know whether a particular test case is covered, the second style shows it more directly. For example, what if the username and password are both empty? It's easy to see that this case has been considered in the second test.
  • The declarative style fails or passes one row at a time. The procedural style is vulnerable, in that if a check in the middle fails, the whole rest of the test may be invalid. (This isn't "free" though – the code connecting the test to the system must be designed to make this test independence work.)
  • The declarative style has all the critical state, inputs, and outputs listed explicitly. In the procedural style, you may have to trace the state through many steps.
  • The procedural style tends to presume it knows details about the interface; this makes it more brittle to change.
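To make the comparison concrete, here is a small sketch (in Python, not the Fit framework itself) of how the declarative table might drive a test. The ok_active() rule and all names here are assumptions for illustration, not a real system's interface:

```python
# A table-driven sketch of the declarative login test.
# ok_active() is a hypothetical rule: OK is highlighted only
# when both fields are non-empty (here, "" stands in for "none").

def ok_active(username, password):
    """The OK button is active only if both fields are filled in."""
    return bool(username) and bool(password)

# Each tuple mirrors one row of the table: (username, password, expected).
CASES = [
    ("",    "",   False),
    ("bob", "",   False),
    ("",    "pw", False),
    ("bob", "pw", True),
]

def run_table(cases):
    # Each row is evaluated independently, as a ColumnFixture would do it.
    return [ok_active(u, p) == expected for u, p, expected in cases]
```

Because each row is independent, a failure in one row doesn't invalidate the others, which is exactly the advantage claimed above.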

Keeping State

Real systems may involve hidden state: state that affects the system but is not directly set or seen. Consider this example:

This is an accumulator: it provides a running sum and average. Obviously, it must somehow keep track of n, the number of items entered, in order to compute the average.

We can make a declarative test out of this by exposing the hidden state to the test:

n sum data sum' avg' comment
1 100 100 200 100 normal
9 100 0 100 10 add 0
2 2 0 2 0 integer divide

The first two columns, n and sum, represent existing state. Data is the number entered on the screen, and sum' and avg' are the results.

Getting to the hidden state can be a trick. Sometimes the programmers can explicitly expose it as part of a testing interface; other times the state can be accessed by a sequence of steps. In this case, we could set up the proper state for each row by doing this:

Hit the clear button
Repeat n-1 times: add 0
Add sum
[Optional: check sum]

Then the rest of the row might be interpreted as:

Add data
Check sum
Check avg

This puts some burden on the test setup code. But if the setup is not done there, it's probably repeated in a bunch of scripts.
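The setup-and-check sequence above can be sketched in code. This assumes a hypothetical Accumulator with clear and add operations; the real system's interface may differ:

```python
# A sketch of recreating hidden state (n, sum) through visible actions,
# then running one row of the declarative table.

class Accumulator:
    """Running sum and average; n (the count of entries) is hidden state."""
    def __init__(self):
        self.clear()

    def clear(self):
        self.n = 0
        self.sum = 0

    def add(self, value):
        self.n += 1
        self.sum += value

    @property
    def avg(self):
        return self.sum // self.n  # integer divide, as the last row shows

def run_row(n, start_sum, data):
    """Set up state (n, start_sum), add data, return (sum', avg')."""
    acc = Accumulator()
    acc.clear()                 # hit the clear button
    for _ in range(n - 1):      # repeat n-1 times: add 0
        acc.add(0)
    acc.add(start_sum)          # add sum
    acc.add(data)               # the row's action
    return acc.sum, acc.avg     # check sum and avg
```

For instance, run_row(1, 100, 100) recreates the first row of the table, yielding sum' = 200 and avg' = 100.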

When to Use the Procedural Style?

Procedural style pays off best when the sequential nature of the feature is what's interesting: when how you got somewhere is as important as where you are. Consider the problem of multiple selection, using click, shift, control, and drag. For setup, imagine a vertical list of items, numbered 1 through 10. Consider this test:

1 click 5  
2 release    
3 expect selected 5
4 expect last 5

Or this one:

1 click 5  
2 drag to 7  
3 drag to 3  
4 release    
5 control    
6 click 7  
7 expect selected 3,4,5,7
8 expect last 7

You could perhaps develop a set of declarative tests for these, but the sequence of actions is what's interesting.
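One plausible selection model that reproduces the expected results in these scripts can be sketched as follows; the class and its click/drag semantics are illustrative assumptions, not taken from any real toolkit:

```python
# A sketch of multiple selection driven by click/drag/control/release.

class Selection:
    """Tracks selected items, the drag anchor, and the last-touched item."""
    def __init__(self):
        self.selected = set()
        self.anchor = None
        self.last = None
        self.control = False

    def click(self, item):
        if self.control:
            # Control-click toggles one item, leaving the rest alone.
            self.selected ^= {item}
            self.control = False
        else:
            self.selected = {item}
        self.anchor = item
        self.last = item

    def drag_to(self, item):
        # Dragging re-selects the range between the anchor and the pointer.
        lo, hi = sorted((self.anchor, item))
        self.selected = set(range(lo, hi + 1))
        self.last = item

    def release(self):
        pass  # ends the gesture; the selection stays as-is

    def press_control(self):
        self.control = True
```

Notice that the interesting assertions (selected = 3,4,5,7; last = 7) only make sense after a particular *sequence* of gestures, which is why the procedural style fits here.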

What to do?

Leverage procedural tests for the cases where the sequence of actions is paramount, and there's not a lot of commonality between scripts. Scenario tests can benefit from the procedural style.

A declarative test is like a table you could put in a user manual: it concisely and concretely explains a function. Declarative tests sometimes require extra setup, especially when hidden state is involved, but they're often worth that extra trouble. Declarative tests are good for testing permutations of a business rule.

If all else is equal, favor the declarative style.

These guidelines can boost your test-writing efficiency: they move repetitive actions into the test setup, and let you focus on the interesting part of a test.

[Written February, 2005. Brian Marick suggested the terms, though I'm not sure I picked the set he liked best.]

Pair Practice

Goal: Get a feeling for pair programming (without any programming).

Time: 20-30 minutes.

Introduction: Pair programming is two programmers working together at one terminal on one task.

Think of a good pair driving across the country. One will drive, the other navigate (thinking tactically and strategically). They'll take turns. They'll pay attention to each other's mood. They'll take a break when they need to. A good programming pair is like this.

This exercise conveys some of the feel of what it's like to be a programming pair.

Materials:
  • List of tasks.
  • Two skill cards (face down).
    Card A: Good at spinning left. Bad at head patting.
    Card B: Good at head/shoulders/knees/toes, bad at chrysanthemum.


Setup:

  • Divide into pairs; decide who is programmer 1 and who is programmer 2.
  • Give each programmer a skill card to memorize; a programmer shouldn't show their skills to the other programmer. (Thus, if programmer 1 has card A, programmer 2 will have card B.)

Rules:
  • If you're the actor on a task: close your eyes, do the task, then stand normally, and say, "OK."
  • Keep the task list out of reach and out of "reading range" for the actor.
  • Work through the list, with one or the other programmer being the actor doing each task.
  • If a task is on the non-actor's "good skill" list, that person should say "May I drive?" and do the task.
  • If a task is on the actor's "bad skill" list, that person should say "Would you drive?" and let the other do the task.
  • Don't be the actor for more than three tasks in a row.

Debrief questions:
  • Does one programmer drive the whole time? [No.]
  • Does the partner provide strategic guidance? [Yes; in this example, they hopefully read the tasks to the actor.]
  • Did the partner check that activities were done, and done right? [Perhaps even checking off the list.]
  • Did the "may I drive/would you drive?" questions feel awkward? [Probably – but you get used to it – and you can use a different phrase.]
  • Is the pair cost-justified? [Did the pair go more quickly and accurately together? Would they in general?]
  • Would you be faster on your own? [Remember the constraint of keeping the paper several feet from the actor.] What if your "bad" skills took ten times as long and your good skills were ten times faster than the average?

Task List:

  1. Stand on one foot for seven seconds.
  2. Put your arms straight over your head and count to three.
  3. Touch your head, then shoulders, knees, and toes.
  4. Spin once around to the left.
  5. Touch your eyes and ears and mouth and nose.
  6. Pat your head.
  7. Rub your stomach.
  8. Spin once around to the right.
  9. Pat your head and rub your stomach at the same time.
  10. Touch your nose and say "chrysanthemum."
  11. Clap your hands twice.
  12. Open your eyes and give your partner a high five – you're done.

William C. Wake

INVEST in Good Stories, and SMART Tasks


XP teams have to manage stories and tasks. The INVEST and SMART acronyms can remind teams of the good characteristics of each.

In XP, we think of requirements as coming in the form of user stories. It would be easy to mistake the story card for the "whole story," but Ron Jeffries points out that stories in XP have three components: Cards (their physical medium), Conversation (the discussion surrounding them), and Confirmation (tests that verify them).

A pidgin language is a simplified language, usually used for trade, that allows people who can't communicate in their native language to nonetheless work together. User stories act like this. We don't expect customers or users to view the system the same way that programmers do; stories act as a pidgin language where both sides can agree enough to work together effectively.

But what are characteristics of a good story? The acronym "INVEST" can remind you that good stories are:

  • I – Independent
  • N – Negotiable
  • V – Valuable
  • E – Estimable
  • S – Small
  • T – Testable

Independent
Stories are easiest to work with if they are independent. That is, we'd like them to not overlap in concept, and we'd like to be able to schedule and implement them in any order.

We can't always achieve this; once in a while we may say things like "3 points for the first report, then 1 point for each of the others."

Negotiable… and Negotiated

A good story is negotiable. It is not an explicit contract for features; rather, details will be co-created by the customer and programmer during development. A good story captures the essence, not the details. Over time, the card may acquire notes, test ideas, and so on, but we don't need these to prioritize or schedule stories.

Valuable
A story needs to be valuable. We don't care about value to just anybody; it needs to be valuable to the customer. Developers may have (legitimate) concerns, but these need to be framed in a way that makes the customer perceive them as important.

This is especially an issue when splitting stories. Think of a whole story as a multi-layer cake, e.g., a network layer, a persistence layer, a logic layer, and a presentation layer. When we split a story, we're serving up only part of that cake. We want to give the customer the essence of the whole cake, and the best way is to slice vertically through the layers. Developers often have an inclination to work on only one layer at a time (and get it "right"); but a full database layer (for example) has little value to the customer if there's no presentation layer.

Making each slice valuable to the customer supports XP's pay-as-you-go attitude toward infrastructure.

Estimable
A good story can be estimated. We don't need an exact estimate, but just enough to help the customer rank and schedule the story's implementation. Being estimable is partly a function of being negotiated, as it's hard to estimate a story we don't understand. It is also a function of size: bigger stories are harder to estimate. Finally, it's a function of the team: what's easy to estimate will vary depending on the team's experience. (Sometimes a team may have to split a story into a (time-boxed) "spike" that will give the team enough information to make a decent estimate, and the rest of the story that will actually implement the desired feature.)

Small
Good stories tend to be small. Stories typically represent at most a few person-weeks worth of work. (Some teams restrict them to a few person-days of work.) Above this size, it seems too hard to know what's in the story's scope. Saying, "it would take me more than a month" often implicitly adds, "as I don't understand what-all it would entail." Smaller stories tend to get more accurate estimates.

Story descriptions can be small too (and putting them on an index card helps make that happen). Alistair Cockburn described the cards as tokens promising a future conversation. Remember, the details can be elaborated through conversations with the customer.

Testable
A good story is testable. Writing a story card carries an implicit promise: "I understand what I want well enough that I could write a test for it." Several teams have reported that by requiring customer tests before implementing a story, the team is more productive. "Testability" has always been a characteristic of good requirements; actually writing the tests early helps us know whether this goal is met.

If a customer doesn't know how to test something, this may indicate that the story isn't clear enough, or that it doesn't reflect something valuable to them, or that the customer just needs help in testing.

A team can treat non-functional requirements (such as performance and usability) as things that need to be tested. Figuring out how to operationalize these tests will help the team learn the true needs.


For all these attributes, the feedback cycle of proposing, estimating, and implementing stories will help teach the team what it needs to know.

SMART Tasks
There is an acronym for creating effective goals: "SMART" –

  • S – Specific
  • M – Measurable
  • A – Achievable
  • R – Relevant
  • T – Time-boxed

(There are a lot of variations in what the letters stand for.) These are good characteristics for tasks as well.

Specific
A task needs to be specific enough that everyone can understand what's involved in it. This helps keep other tasks from overlapping, and helps people understand whether the tasks add up to the full story.

Measurable
The key measure is, "can we mark it as done?" The team needs to agree on what that means, but it should include "does what it is intended to," "tests are included," and "the code has been refactored."

Achievable
The task owner should expect to be able to achieve a task. XP teams have a rule that anybody can ask for help whenever they need it; this certainly includes ensuring that task owners are up to the job.

Relevant
Every task should be relevant, contributing to the story at hand. Stories are broken into tasks for the benefit of developers, but a customer should still be able to expect that every task can be explained and justified.

Time-Boxed
A task should be time-boxed: limited to a specific duration. This doesn't need to be a formal estimate in hours or days, but there should be an expectation so people know when they should seek help. If a task is harder than expected, the team needs to know it must split the task, change players, or do something to help the task (and story) get done.


As you discuss stories, write cards, and split stories, the INVEST acronym can help remind you of characteristics of good stories. When creating a task plan, applying the SMART acronym can improve your tasks.

[I developed the INVEST acronym, and wrote this article in April and August, 2003. Thanks to Mike Cohn for his encouragement and feedback.]

Related Articles


Postscript – Added 2-16-2011

I've been asked a few times where the INVEST acronym came from. I consciously developed it: I sat down and wrote every attribute I could think of applying to good stories: independent, small, right-sized, communicative, important, isolated, etc. This gave me a page full of words. Unfortunately, I haven't kept that list.

Then I grouped the words into clusters:

  • Isolated, independent, separate, distinct, non-overlapping, …
  • Important, valuable, useful, …
  • Discrete, triggered, explicit, …

The categories were a little fuzzy; I had about ten.

I identified "centers": words that captured the essence of their category. Some clusters had two or three plausible centers. That was OK, as I just wanted their first letters for the acronym.

Then it was scramble time: take the three or four clusters that had to be there, plus some of the less important ones, and scramble the initials to find a word that fit. I wanted a word that had 4-6 letters, with no repeats.

I tried a lot of combinations. I remember at one point that if I had a G I could make VESTIGE or VESTING. Having VESTIN_, I realized I could turn it around to get INVEST, which sounded much better than VESTIGE:) So INVEST won, and it's been popular enough that I'm sure I made the right choice.

Fit Spreadsheet

Ward Cunningham has created an acceptance testing framework known as fit. (See http://fit.c2.com for more details.) In this brief experiment, we'll use tests to help specify a simple spreadsheet for strings.

Starting Fit

To use fit, you create a web page that has tables in it; the tables specify tests. (There are other options but that is easiest.) In this case, I'm using Microsoft Word(tm) and saving the file in HTML format.

The fit FileRunner acts as a filter: given a web page, it copies text outside of tables as is, and runs your program on the table entries. Some table entries represent tests that can pass or fail; fit colors them green or red respectively. The output is another HTML file.

Fit will also put a summary in the file if you put in a table like this (a one-cell table naming the fit.Summary fixture):

fit.Summary

With this tool, you don't manipulate screen elements directly. Instead, you work with an abstraction of them. To me, it feels like talking to somebody over the phone, trying to tell them how to use an application. ("In cell cee seventeen, put equals a one; then go to a one and type 'fish'.")

This article shows the input to fit; the result of running it is here.

Programming and Configuration Notes

Fit is a tool for customers and testers, but programmers will use it as well, and will have to write some of the fixtures the team uses. In this paper, I've tried to use the framework mostly straight out of the box.

The CLASSPATH needs to include fit.jar (both in the DOS window and the IDE). The runner command I'm using is:

java fit.FileRunner FirstFit-in.htm FirstFit-out.htm

When I do this on the file I have so far, it creates the output file and writes this to the console:

0 right, 0 wrong, 0 ignored, 0 exceptions


Tables in the input file have the name of a fixture in the first row. A fixture is a class that knows how to process the table. Fit comes with several fixtures built in, and programmers can create others.

One simple fixture is the ColumnFixture. In this fixture, the first row is the fixture name, and the second row has the names of data. If a name ends without parentheses, it is regarded as a field to fill in; with parentheses, it's treated as a method (function) call. The fixture fills in all the data fields, and then calls the methods to verify that they return the expected results.
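As a sketch of that convention, here's what such a class might look like. (This is illustrative plain Java with the fit dependency omitted so it stands alone; with fit on the classpath it would be declared as `extends fit.ColumnFixture`, and the names `Arithmetic`, `x`, `y`, and `sum()` are hypothetical.)

```java
// Sketch of the ColumnFixture convention: public fields for input
// columns, methods for columns whose headers end in parentheses.
public class Arithmetic {
    public int x;      // filled in from the "x" column
    public int y;      // filled in from the "y" column

    public int sum() { // checked against the "sum()" column
        return x + y;
    }
}
```

A table using it would name the fixture in the first row, then have columns headed "x", "y", and "sum()"; the fixture fills the fields from each row and colors the "sum()" cell green or red.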

Another standard fixture is the ActionFixture. This one consists of a series of commands. These include:

  • start classname: Creates an object of the specified class
  • enter field value: Sets the field to the value
  • press button-name: Calls the method corresponding to the button
  • check method value: Checks that the method returns the expected value

The ActionFixture ignores anything past the first three columns; we'll use the fourth column for comments.

So, we're finally ready to start our application.

start Spreadsheet Create a new spreadsheet.

This test doesn't ask for much, but of course it fails. (There isn't any code yet!)

            0 right, 0 wrong, 0 ignored, 1 exceptions

Programmer Notes

The exception is thrown because the Spreadsheet object doesn't exist. To create it as simply as possible, make it extend Fixture:

import fit.Fixture;

public class Spreadsheet extends Fixture {}

This gets us back to

            0 right, 0 wrong, 0 ignored, 0 exceptions

I've put together stubs for the fixtures used in this article: Spreadsheet.java, SpreadsheetFormula.java, and Address.java; here's a zip file containing all three.

A Few Stories

We have several things we want our spreadsheet to do:

  • Track the contents of cells
  • Distinguish data from formulas
  • Provide both data and formula views of cells
  • Support "+" for appending strings, "'" for reversing strings, "()" for grouping, and ">" for string containment.


The spreadsheet has a number of cells, each of which has an address. Cells contain string data or formulas.

We'll assume several screen elements:

  • a1 – the cell "A1". For "enter," we'll put something in the cell; for "check," we'll get its displayed value.
  • b1 – the same for cell "B1".
  • formula – the formula of the last-mentioned cell.

We'll start with a simple data cell.

fit.ActionFixture Comments
start Spreadsheet    
enter a1 abc  
check a1 abc Text in cell
check formula abc Formula is same. (Looks in last-mentioned cell.)

Now let's add in a formula cell. (Note that this table omits the "start" line; this means it's working on the same object as before. This lets us not repeat the setup, but it also makes the tests less independent.)

fit.ActionFixture Comments
enter a1 abc  
enter b1 =A1 Simple copying formula
check formula =A1 Formula is there
check a1 abc Original text in A1
check b1 abc Text was copied to B1

The essence of a spreadsheet is the automatic updates. Let's change A1 and see it happen.

fit.ActionFixture Comments
enter a1 abc  
enter b1 =A1 Simple copying formula
check b1 abc Copied value
enter a1 revised Update A1
check b1 revised Automatically updates B1

We already have quite a few elements in use, though we haven't specified exactly what is valid. Let's just note the "specification debt" and move on.

  • What can a cell hold? Empty string, other string, formula starts with "="
  • What's a valid address? Letter plus digits; ignore leading 0s; case-insensitive.
  • What's a valid formula? So far, we've just used a simple cell reference, but we want operators too.
  • What happens when a cell has an invalid formula?
  • What happens when a cell refers to a cell containing a formula?
  • What happens when formulas form a loop?

We'll pursue all these, but let's start with formulas.


Formulas can reference formulas. We'll use a new ColumnFixture, SpreadsheetFormula, that lets us specify the inputs and expected outputs of cells. This fixture should access the same kind of spreadsheet as the one used by the Spreadsheet fixture.

a1 b1 c1 d1 a1() b1() c1() d1()
data =A1 =B1 =C1 data data data data

Formulas get more interesting when there are operators available. The reverse operator (') is probably a good one to start with.

a1 b1 b1()
abc =A1' cba
abc =A1'''' abc

The most useful string operator is probably append (+). Fit ignores input cells that are left blank, so we'll explicitly use the word "blank" when we want an empty cell. The fixture will have to take this into account.

a1 b1 c1 b1() c1()
abc =A1+A1 blank abcabc  
abc def =A1+B1+B1+A1 def abcdefdefabc

We have enough features that we can demonstrate an identity: (XY)'=Y'X'. We don't have parentheses yet, but we can simulate this by putting the parts in separate cells.

a1 b1 c1 d1 e1 d1() e1()
abc xyz ignored =A1+B1 =D1' abcxyz zyxcba
abc xyz =B1' =A1' =C1+D1 cba zyxcba
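The identity can also be checked outside the spreadsheet. Here's a quick plain-Java sketch, where a `reverse` method stands in for the postfix ' operator (the class and method names are mine, not the article's):

```java
public class IdentityCheck {
    // Stand-in for the spreadsheet's postfix ' (reverse) operator.
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        String x = "abc", y = "xyz";
        String left = reverse(x + y);            // (XY)'
        String right = reverse(y) + reverse(x);  // Y'X'
        System.out.println(left + " == " + right + " : " + left.equals(right));
        // prints "zyxcba == zyxcba : true"
    }
}
```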

Parentheses can be used to group operators. Let's re-do the previous test, allowing parentheses:

a1 b1 c1 c1()
abc xyz =(A1+B1)' zyxcba
abc xyz =B1'+A1' zyxcba

The operator ">" tells whether one string contains another one. If the first string contains the second, the result is the second. If the first string doesn't contain the second, the result is an empty string.

a1 b1 c1 c1()
banana ana =A1>B1  ana
banana bab =A1>B1  
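The rule behind these rows is small enough to sketch in plain Java (containsOp is a hypothetical name, not part of the article's code):

```java
public class ContainsDemo {
    // a > b yields b when a contains b, and the empty string otherwise.
    static String containsOp(String a, String b) {
        return a.contains(b) ? b : "";
    }

    public static void main(String[] args) {
        System.out.println(containsOp("banana", "ana")); // prints "ana"
        System.out.println(containsOp("banana", "bab")); // prints "" (empty line)
    }
}
```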

We haven't talked about precedence yet. The ' and () operators have the highest precedence, then +, then >. A1+B1+C1 is a legal expression, but A1>B1>C1 is not.

a1 b1 c1 c1()
abc xyz =A1+B1' abczyx
abc xyz =(A1+B1)' zyxcba


a1 b1 c1 d1 e1 e1()
abcdef ghijkl e hgf =A1+B1>C1+D1' efgh
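One way to summarize that precedence is a small grammar. (This is my reconstruction from the examples above, not something taken from the article.)

```
expression := term ( ">" term )?        -- at most one containment
term       := factor ( "+" factor )*    -- append, left-associative
factor     := primary "'"*              -- postfix reverse binds tightest
primary    := address | "(" expression ")"
```

Under this grammar, =A2'''+A3 is legal, A1>B1>C1 is not, and =A1+B1>C1+D1' groups as (A1+B1) > (C1+(D1')), matching the table above.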

Filling in the Gaps

We have several questions left open:

  • What can a cell hold? Empty string, other string, formula starts with "="
  • What's a valid address? Letter plus digits; ignore leading 0s; case-insensitive.
  • What happens when a cell has an invalid formula?
  • What happens when formulas form a loop?

The previous tests made a quick pass through the system. I think of them as generative: they help define the essence of the system. But questions like the above require us to fill in the gaps. I think of tests that do things like check "corner cases," error cases, and how features interact as elaborative; they fill in what we already have. They might find problems, but they may well work already, depending on how the system was built.

What a cell holds

We already have test cases where a cell holds a string, and where a cell holds a formula, but it would be prudent to check that the operators work correctly on empty strings. If e is the empty string and x is a non-empty string, we expect:

            e' = e        e + e = e        e > e = e
            e + x = x     x + e = x
            e > x = e     x > e = e

As I go to write the test, I realize that we never specified what a cell starts with. The answer, of course, is the empty string. So we'll rely on that: A1 will be empty.

fit.ActionFixture Comments
start Spreadsheet    
check a1   Verify that cell starts empty.

Then we can verify those rules about working with the empty string:

a1 b1 c1 c1() Comment
blank blank =A1' blank e'=e
blank blank =A1+A1 blank e+e=e
blank blank =A1>A1 blank e>e=e
blank abc =A1+B1 abc e+x=x
blank abc =B1+A1 abc x+e=x
blank abc =A1>B1 blank e>x=e
blank abc =B1>A1 blank x>e=e

Valid Addresses

There are two places we use addresses: in the address field and in the cells with formulas. When we get a "real" (graphical) interface, the address will mostly be implicit. But even so, we'll test it here just to be safe.

Let's introduce a new fixture, Address. It will be a ColumnFixture: we'll put address in one column, valid() in another, and standardized() in another. (A programmer will have to write the new fixture for us.)

The rules are: a valid address is a letter (A-Z, a-z) followed by one or more digits (0-9). Case is ignored. Leading 0s are ignored. "0" is not a valid row number.

address valid() standardized()
A1 true A1
a1 true A1
A9874 true A9874
Z1 true Z1
z1 true Z1
Z3992 true Z3992
z3992 true Z3992
AA393 false  
zX202 false  
é17 false  
1 false  
~1 false  
~D1 false  
y&1 false  
^ false  
X392% false  
H001 true H1
j00010 true J10
e000 false  
A0 false  
z0 false  
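Those rules might be implemented along these lines. (A hedged sketch, not the article's actual Address fixture; the class and method names are assumptions.)

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressRules {
    // One letter (A-Z, a-z), then one or more digits.
    private static final Pattern FORM = Pattern.compile("([A-Za-z])([0-9]+)");

    static boolean valid(String address) {
        Matcher m = FORM.matcher(address);
        // Row must not be 0 once leading zeros are stripped.
        return m.matches() && Integer.parseInt(m.group(2)) != 0;
    }

    static String standardized(String address) {
        Matcher m = FORM.matcher(address);
        if (!m.matches()) throw new IllegalArgumentException(address);
        // Upper-case the letter; parsing the row drops leading zeros.
        return m.group(1).toUpperCase() + Integer.parseInt(m.group(2));
    }

    public static void main(String[] args) {
        System.out.println(standardized("j00010")); // prints "J10"
        System.out.println(valid("e000"));          // prints "false"
    }
}
```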

Let's make sure that case-insensitivity works in formulas:

a1 b1 b1()
abc =A1+a1 abcabc

Formula Errors

If a formula contains an error, we'd like it to display as "#error." We'll put all the invalid names from the previous table into formulas, and verify that formulas behave correctly. Then we'll try various improper combinations of operators.

start Spreadsheet Create a new spreadsheet.
enter a1 =AA393 Bad address
check a1 #error Marked as error
check formula =AA393 Formula as written
enter a1 =A2 Change to valid address
check a1   Make sure #error is cleared 


a1 a1() Comment
=zX202 #error Two letters
=é17 #error Non-ASCII
=1 #error No letters
=~1 #error No letters
=~D1 #error Unacceptable character
=y&1 #error Extra character
=^ #error No letters/digits
=e000 #error Invalid row # (all zeros)
=A0 #error Invalid row #
=z0 #error Invalid row #
= #error Missing formula

 Then we'll get to some operators:

a1 a1()
='A2 #error ' should be postfix
='A2' #error Can't be before and after
=A2+ #error Need other term
=A3+A4+ #error Need other term
=A2++A3 #error Missing term
=A2+'+A3 #error ' isn't a term
=A2'''+A3 blank OK to mix things
=A2) #error Missing (
=(A2 #error Missing )
=((((((((((((A2)))))))))))) blank OK – big expression
=((((((A2+(A3))))+A4) #error Unbalanced – too few )
=(((A2>A3 #error Unbalanced – too few )
=(A2>A3))) #error Unbalanced – too many )
=A2>A3> #error Can't trail >
=A2>A3>A4 #error Can't repeat >


If a formula uses itself (directly or indirectly), we don't want it to loop forever trying to figure it out. Instead, we'd like the display to be "#loop."

a1 b1 c1 d1 e1 a1() e1()
=A1 blank blank blank blank #loop blank
=B1 =C1 =F1+D1 =E1 no-loop no-loop no-loop
=B1 =C1 =F1+D1 =E1 =A1 #loop #loop
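One common way to produce "#loop" is to track which cells are currently being evaluated; re-entering one signals a cycle. Here's a toy sketch under a simplifying assumption (each cell holds plain text or a single reference like =B1; names are mine, not the article's):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LoopDemo {
    // Toy model: each cell holds plain text or a single reference like "=B1".
    static Map<String, String> cells = new HashMap<>();

    static String value(String cell, Set<String> visiting) {
        if (!visiting.add(cell)) return "#loop";  // already evaluating: cycle
        String content = cells.getOrDefault(cell, "");
        String result = content.startsWith("=")
                ? value(content.substring(1), visiting)
                : content;
        visiting.remove(cell);
        return result;
    }

    public static void main(String[] args) {
        cells.put("A1", "=B1");
        cells.put("B1", "=A1");
        System.out.println(value("A1", new HashSet<>())); // prints "#loop"
        cells.put("B1", "hello");
        System.out.println(value("A1", new HashSet<>())); // prints "hello"
    }
}
```

A real spreadsheet would also need to propagate "#loop" through operators like + and ', just as "#error" propagates.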


This paper has demonstrated a set of tests using the fit acceptance testing framework. Some things to note:

  • The tests here have been written as if a customer specified them, without much demonstration of the programming cycle. But programmers can work with these tests in much the way they would with JUnit.
  • The tests are written without benefit of the feedback of a working system. (I wrote just enough code to make the first test not throw an exception.) When I went back to implement the system, I found a number of bugs in the tests.
  • The tests look at only part of the system: the core functionality. There are other aspects of a real application that we aren't testing. (For example, it may be non-trivial to connect a screen to the core code.)
  • Even a small application such as this requires a fairly large set of tests. With more programming work on the fixtures, we might be able to reduce some of the noise. Real applications will organize tests into multiple files, and will have to pay more attention to the challenges of consistency, test independence, and feature interaction.
  • It feels smooth to mix light natural-language specification with formal, executable tests.
  • Fit has a number of features we haven't used.

I've heard that many teams use xUnit for unit testing, but still struggle to get customer tests in place before (or even after) stories are implemented. I hope frameworks such as fit can help lower the barriers to this crucial task.



Resources and Related Articles

[Written April 20, 2003; revised April 26, 2003, to correct mis-stated identity & in response to Ward Cunningham's great suggestions about improving the fixtures. Revised May 1, 2003 to fix some test problems. 2012 – the WordPress version is designed to simulate the original look.]

Extreme Programming Explored


Extreme Programming Explored, by William C. Wake. Addison-Wesley, 2001. Foreword by Dave Thomas (The Pragmatic Programmer).

This book grew out of the XPlorations series of articles. I wrote them as I was learning XP, and relating it to my own experience and practices.


The best version is the book itself. It reflects the feedback of reviewers and editors. You can purchase it somewhere like Amazon.com.

The XPlorations series continues to grow. 

Table of Contents


Chapter 1. Introducing XP…………………………………….1
Programming, team practices, and processes.

Section 1: Programming

Chapter 2. How do you program in XP?…………..11
XP uses incremental, test-first programming.

Chapter 3. What is refactoring?………………………….29
"Refactoring: Improving the design of existing code."
–Martin Fowler

Section 2: Team Practices

Chapter 4. What are XP’s team practices?………51
We’ll explore these practices and their alternatives.

Chapter 5. What’s it like to program in pairs?..65
Pair programming is exhausting but productive.

Chapter 6. Where’s the architecture?………………..77
Architecture shows up in spikes, the metaphor, the first iteration, and elsewhere.

Chapter 7. What is the system metaphor?………..87
"The system metaphor is a story that everyone–customers, programmers, and managers–can tell about how the system works."
–Kent Beck

Section 3: Process

Chapter 8. How do you plan a release?
What are stories like?
Write stories, estimate stories, and prioritize stories.

Chapter 9. How do you plan an iteration?……..115
Iteration planning can be thought of as a board game.

Chapter 10. Customer, Programmer, Manager:
What’s a typical day?
Customer: questions, tests, and steering;
Programmer: testing, coding, and refactoring;
Manager: project manager, tracker, and coach.

Chapter 11. Conclusion………………………………………..143

Chapter 12. Annotated Bibliography………………….145

Simplified Iteration Planning

A simplified approach to iteration planning.


XP teams generally use relative story points along with the "Yesterday's Weather" rule for story planning, but there's more variety in how they do iteration planning:

  • Relative estimates (typically as one to three "task points")
  • Absolute estimates (in days or hours)
  • No task estimates (but using a shared sense of when a task is "too big")

This article describes a simplified approach to iteration planning that omits task estimates.


  1. The team reports the number of story points completed last iteration.
  2. The Customer selects that same number of story points for the next iteration. (This is known as the "Yesterday's Weather" rule.)
  3. The Customer prioritizes the selected stories high to low.
  4. Find a blank area for the task list (whiteboard, large piece of paper, …).
  5. Working from highest priority story to lowest:
    1. Read the story; write its name and estimate on the task list.
    2. Brainstorm the tasks that will accomplish this story. Keep them short (about half a day for a pair is a good size). It's preferred that the Customer participate in this brainstorming.
    3. Write down the tasks you decide on underneath the story.
    4. Add a task "Verify story is complete." (This could include verifying tests, checking off the story, walking through it with the customer, etc.)
    5. Leave some blank space before the next story.
  6. Eyeball the tasks to make sure it looks feasible to complete the planned stories. If necessary, adjust a story's estimate, and get the Customer to drop or split stories to reach the planned number of points.

During the Iteration

  • If you have no active task, sign up for the next task.
    • "Next" is any task from the unfinished story that has the highest priority.
    • If the task breakdown makes it hard to do a task on the current story without stepping on somebody else's feet, it's acceptable to pick a task from the next most important story; but work on your ability to break down stories into nearly independent tasks.
    • Sign up as individuals rather than pairs. (Some teams do the opposite but I like pairs to feel more fluid.)
  • When you complete a task, cross it off and select another one.
  • If you need to split a task or if you realize that some other task needs to be included for this story, add it in the spare blank space. Make sure others are aware of what you've done.
  • Have a set time, about halfway through the iteration, for a "sanity check": make an assessment of whether you're on track to finish the stories planned for this iteration. Shuffle around people or tasks as needed to make this happen, or let your Customer know if you expect to need them to split, drop, or add stories.
  • Maintain a high level of team communication: shared workspace, pairing, standup meetings, talking to the whole room, asking for help, retrospectives, and so on.

Benefits

  • Developers are signed up for only one task at a time.
  • The highest priority story is being worked on at any time.
  • Relatively small tasks are less likely to be bigger than expected, and may let a team apply more pairs to a story concurrently.
  • Stories are completed throughout the iteration (rather than all at once in a "little bang" at the end of the iteration).
  • If there's too much work for the iteration, or if a new story comes in, the lowest priority story probably has had no work done on it.
  • An explicit "verify story" task helps make sure the tasks add up to the whole story (and gives an opportunity to make sure the Customer agrees).
  • The team's focus is on story completion.

Comparison to "Classic" Iteration Planning

Iteration planning as described in most of the XP books (including Extreme Programming Explained, Extreme Programming Explored, Planning Extreme Programming, etc.) is a little more "front-loaded" than in the simplified form described above:

  • Classic iteration planning has both story points and task points; simplified planning uses story points only.
  • Classic iteration planning in Extreme Programming Explained uses task cards and load factor; simplified planning uses text on a chart. (I don't know of any teams that still use the "white book" approach.)
  • Classic iteration planning talks about individual velocities; simplified planning doesn't address that.
  • Classic iteration planning allows for a tracker role, to help figure out whether individuals, tasks, and stories are on track; simplified planning has less need for a tracker.
  • Classic iteration planning has no slack to allow for new tasks coming in; simplified planning accepts that new tasks will arrive, so it doesn't commit people past their current task.
  • Classic iteration planning relies on the team's and the individuals' ability to balance the load based on their guess at the beginning of the iteration; simplified planning balances the load during the iteration.

The scales aren't tipped fully one way:

  • Simplified planning relies more heavily on communication during the iteration.
  • Simplified planning doesn't give direct feedback on individuals' velocity.
  • Simplified planning prefers smaller tasks.


This article has described a simplified form of iteration planning. By keeping tasks small, maintaining team communication, and focusing on the highest priority story first, iteration planning can become easier and the team can be more responsive.

Resources and Related Articles

[Written September, 2002. Revised Sept. 8, 2002 in response to Mike Clark's suggestion.]