Tag Archives: programmer

Review: Structured Programming (Dahl, Dijkstra, and Hoare)

Structured Programming Structured Programming, by O.-J. Dahl, E.W. Dijkstra, and C.A.R. Hoare. Academic Press, 1972. 

This year (2012) is the 40th anniversary of this text, but it holds up well. It consists of three essays:

  • "Notes on Structured Programming" by E.W. Dijkstra
  • "Notes on Data Structuring" by C.A.R. Hoare
  • "Hierarchical Program Structures" by O.-J. Dahl and C.A.R. Hoare

If you've been led to think "Structured Programming = No GOTOs", the first essay will change your mind. It's much more a consideration of design in the small than a focus on surface form. 

"Data Structuring" considers how to structure types; Pascal's type structure (remember that?) of records, sets, enumerations etc. clearly embodies some of these ideas. You can see roots of the "object-based" approach (that never seemed to make it to the mainstream in the way object-oriented approaches did). 

The final essay describes mechanisms from SIMULA 67, an early (if not the original) object-oriented programming language. You can see the early consideration of objects and coroutines, classes and subclasses. 

There's definitely a "historical" air to these essays, but they hint at a path almost taken. Modern libraries and the modern rush to deliver let us ignore many ideas about software design, but they don't go away. These articles take a slower, more mathematical path than I've seen anybody apply in practice, but it brings clarity of thought about mapping problems and solutions that I'd like to imitate. 

3A – Arrange, Act, Assert

Some unit tests are focused, other are like a run-on sentence. How can we create tests that are focused and communicate well?

What's a good structure for a unit test?

3A: Arrange, Act, Assert

We want to test the behavior of objects. One good approach is to put an object into each "interesting" configuration it has, and try various actions on it. 

Consider the various types of behaviors an object has:

  • Constructors
  • Mutators, also known as modifiers or commands
  • Accessors, also known as queries
  • Iterators

I learned this separation a long time ago but I don't know the source (though my guess would be some Abstract Data Type research). It's embodied in Bertrand Meyer's "command-query separation" principle, and others have independently invented it.

With those distinctions in mind, we can create tests:

Arrange: Set up the object to be tested. We may need to surround the object with collaborators. For testing purposes, those collaborators might be test objects (mocks, fakes, etc.) or the real thing.

Act: Act on the object (through some mutator). You may need to give it parameters (again, possibly test objects).

Assert: Make claims about the object, its collaborators, its parameters, and possibly (rarely!!) global state. 

Where to Begin?

You might think that the Arrange is the natural thing to write first, since it comes first.

When I'm systematically working through an object's behaviors, I may write the Act line first. 

But a useful technique I learned from Jim Newkirk is that writing the Assert first is a great place to start. When you have a new behavior you know you want to test, Assert First lets you start by asking "Suppose it worked; how would I be able to tell?" With the Assert in place, you can do what Industrial Logic calls "Frame First" and lean on the IDE to "fill in the blanks." 

FAQ

Aren't some things easier to test with a sequence of actions and assertions?

Occasionally a sequence is needed, but the 3A pattern is partly a reaction to large tests that look like this:

  • Arrange
  • Act
  • Assert
  • Act
  • Assert
  • Arrange more
  • Act
  • Assert

To understand a test like that, you have to track state over a series of activities. It's hard to see what object is the focus of the test, and it's hard to see that you've covered each interesting case. Such multi-step unit tests are usually better off being split into several tests.

But I won't say "never do it"; there could be some case where the goal is to track a cumulative state and it's just easier to understand in one series of calls. 

Sometimes we want to make sure of our setup. Is it OK to have an extra assert?

Such a test looks like this:

  • Arrange
  • Assert that the setup is OK
  • Act
  • Assert that the behavior is right

First, consider whether this should be two separate tests, or whether setup is too complicated (if we can't trust objects to be in the initial state we want). Still, if it seems necessary to do this checking, it's worth bending the guideline.

What about the notion of having "one assert per test"?

I don't follow that guideline too closely. I consider it for two things: 

  1. A series of assertions may indicate the object is missing functionality which should be added (and tested). The classical case is equals()It's better to define an equals() method than (possibly create and) repeat a bunch of assertions about held data.
  2. A series of similar assertions might benefit from a helper (assertion) method.

(If an object has many accessors, it may indicate the object is doing too much.)

When a test modifies an object, I typically find it easiest to consider most accessors together. 

For example, consider a list that tracks the number of objects and the maximum entry. One test might look like this:

    List list = new List();
    list.add(3);
    assertEquals(1, list.size());
    assertEquals(3, list.max());

That is, it considers the case "what all happens when one item is inserted into an empty list?" Then the various assertions each explore a different "dimension" of the object.

What about setup?

Most xUnit frameworks let you define a method that is called before each test. This lets you pull out some common code for the tests, and it is part of the initial Arrange. (Thus you have to look in two places to understand the full Arrange-ment.)

What about teardown?

Most xUnit frameworks let you define a method that is called after each test. For example, if a test opens a file connection, the teardown could close that connection.

If you need teardown, use it, of course. But I'm not adding a fourth A to the pattern: most unit tests don't need teardown. Unit tests (for the bulk of the system) don't talk to external systems, databases, files, etc., and Arrange-Act-Assert is a pattern for unit tests. 

History

I (Bill Wake) observed and named the pattern in 2001. "Arrange-Act-Assert" has been the full name the whole time, but it's been variously abbreviated as AAA or 3A. Kent Beck mentions this pattern in his book Test-Driven Development: By Example (p. 97). This article was written in 2011. Added a description of Assert First and Frame First due to Brian Marick's comment. [4/26/11] 

Review – Pragmatic Guide to Git (Swicegood)

Pragmatic Guide to Git Pragmatic Guide to Git, by Travis Swicegood. Pragmatic Bookshelf, 2010.
 
I'm using git for the first time on a small project with a friend, and wanted a quick focused handbook to help with that. This book fills that bill. The guts of the book is a series of short descriptions, followed by concrete sets of commands to demonstrate how to make that work. Most of the time, the command reference has just what I'm looking for. (I've still got some blind spots on the tool but I won't blame the book for that; now that I have a little experience, I'll go back through some of the more expository material.)

Review – Beginning iPhone 3 Development

Beginning iPhone 3 Development

Beginning iPhone 3 Development: Exploring the iPhone SDK, by Dave Mark and Jeff LaMarche. Apress 2009.

I'm a former NeXT programmer who hasn't programmed the Mac since before Apple pulled in the NeXT development kit. The iPhone environment looks very similar to the old NeXT environment, except with a lot of new libraries for the cool stuff. 

Mark and LaMarche start from the basics, and cover the key ideas and widgets: outlets, actions, and controllers; views, tab bars, tables, and more.

I'm still a beginner, but I feel like this has given me a good foundation to push further. 

Tests from a Hat

How does the order of tests affect the design of software?

Exercise:

  • Write a bunch of plausible unit test descriptions.
  • Slice them into strips and put them in a hat.
  • Repeat until the hat is empty:
    • Pull tests one at a time from a hat.
    • Write the indicated test
    • Write the code to make the last test pass, TDD style.
    • If you absolutely must create a prerequisite test, do so.
    • Refactor as you go.
  • Set aside your implementation, shuffle the tests back into the hat, and repeat once or twice.

Exercise ideas:

  • spreadsheet (domain objects, parsing, representation of storage),
  • auction sniper (from Freeman and Pryce’s Growing Object-Oriented Software),
  • video rental store (from Fowler et al’s Refactoring),
  • currency (from Beck’s Test-Driven Development),

Sample Tests for a Spreadsheet:

  1. New sheet is empty
  2. Inserting into a sheet stores a value
  3. Numeric cells are formatted as numbers
  4. Cells retain literal values (for editing) Ex: 007 is edited as 007 but displays as 7.
  5. Cells starting with = are formulas.
  6. Constant formula Ex. =7
  7. Parentheses in formulas
  8. Multiplication in formulas
  9. Addition in formulas
  10. Precedence in formulas

Reflection Points:

  • Did you encode your tests the same each time?
  • Did the solutions come out basically the same? Why or why not?
  • Did you see alternative solution paths when you went through again?

Source:
Bill Wake, 2010. Inspired by "Scenes From a Hat" in improv.

Review – Growing Object-Oriented Software, Guided by Tests

Growing Object-Oriented Software, Guided by Tests, by Steve Freeman and Nat Pryce, ISBN 0-321-50362-7

 
Freeman and Pryce explain Test-Driven Development through an extended example. (They have a somewhat different perspective than I do, with much heavier use of mock objects. I'm not sure what drives this difference; it may be due to type of application, philosophical reasons, or just something I should learn.) I love the clarity with which they tackle the problem of driving in from all the way outside, and how they don't shy away from dealing with concurrency and persistence. It's high praise for me to say a book deserves further study; this book is one I'll definitely be reading again. 
 

 
 

Review – Structured Design

Structured Design. Edward Yourdon and Larry L. Constantine. Prentice-Hall, 1979.

This was one of the early structured "standard works" that I've only just gotten to for the first time. I'd learned things like coupling and cohesion, afferent and efferent flows, and the concept of factoring, but it's much stronger coming directly from this source. Not everything here is compatible with the way I think about design, but this is one of those books that deserve repeated study.

Review – Exploring Requirements

Exploring Requirements: Quality Before Design, Donald C. Gause and Gerald M. Weinberg. 1989, Dorset House.
This book is an exploration not just of gathering requirements, including the challenges of ambiguity. The authors describe how to clarify expectations by using functions, attributes, constraints, and preferences. They treat exploration of requirements as a human process, including discussion of how to facilitate different types of effective meetings. My favorite part is the set of context-free questions that apply to many situations. (Reviewed Aug., ’09)

Programming Language Puzzle

This puzzle contains the names of different programming languages. You solve it by writing in the missing letters.

Puzzle: (PDF)

Solution: (PDF)

 

Clues

If you don't want to see the whole solution, page down for a series of clues.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

IAL ("International Algorithmic Language") was an early name for Algol.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

SIMPL-T was developed by a group led by Vic Basili.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

COBOL is in the puzzle, but none of its letters are showing.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

MLISP has none of its letters showing either. It's the name of a couple different Lisp variants.
 

Review – Clean Code

Clean Code, Bob Martin, Prentice-Hall, 2008.

Bob Martin tackles the challenges of making code sparkling clean. He provides numerous guidelines, and demonstrates their utility in action. I particularly appreciated some of the longer examples where he really works them over. You’ll especially find this book compelling if you’re interested in craftsmanship, refactoring, and/or concrete design. (Reviewed Feb., ’08)

Review – Current Trends in Programming Methodology, Vol. 4: Data Structuring

Current Trends in Programming Methodology, Volume 4: Data Structuring, Raymond T. Yeh, editor. Prentice-Hall, 1978.

In parallel with "structured programming," (which often focused on code structure), there was more esoteric work done on "structured data." A lot of this found fruition in things like container libraries. (Few people write their own hash tables any more.) But there was a deeper side of this that explores formal methods for data structures. This out-of-print collection of articles explores many aspects of the algebra of data structures. (Reviewed Dec., ’07)

Sudoku Solver

Sudoku is a fairly well-known type of puzzle. Solving it turns out to be easier than I expected, but a somewhat odd example of test-driven development.

There's been discussion on the extremeprogramming group about solving Sudoku puzzles.  I saw Ron Jeffries first do a spike, and then start solving. Then Cory Foy pitched in his version as well. I skipped to the bottom of his article and found out it wasn't doing an exhaustive search, but rather looking for simple "forced" cells.

From the discussion, I had decided I wouldn't be fancy on the indexing: I'll just use a simple array, and define constraints as indexes on the array. I also decided to first do an exhaustive depth-first search, and then performance-tune if necessary.

First Test, Unusually Large


Here's my first test:

  public void testBoardThatHasToApplyConstraints() {
  Board board = new Board(new int[] {
    1, 2,
    0, 1 });
  ArrayList constraints = new ArrayList();
  constraints.add(new Constraint(new int[] { 0, 1 }));
  constraints.add(new Constraint(new int[] { 2, 3 }));
  constraints.add(new Constraint(new int[] { 0, 2 }));
  constraints.add(new Constraint(new int[] { 1, 3 }));
  
  Board result = new Solver().solve(board, constraints);
  Board expected = new Board(new int[] {
    1, 2,
    2, 1 });
  assertEquals(expected, result);
}

This isn't a real Sudoku puzzle but it's close enough to get me started.

I don't often start with a test this big but I wanted to capture the direction I plan to go. I used this chance to create stubs for the Board, Constraint, and Solver classes and their methods:

 
public class Board {
    public Board(int[] cells) {}
}

 

public class Constraint {
    public Constraint(int[] indexes) {}
}

 

public class Solver {
    public Board solve(Board board, ArrayList constraints) {
         return null;
    }
}

As you'd expect, this test comes back red. It's definitely too advanced for where we are. So I'm going to rename the failing test method to xtestBoardThatHasToApplyConstraints(), (so it won't run) and take a smaller bite, what Kent Beck calls a Child Test.


Board Equality

My smaller step will be to test whether two boards are equal. It's much easier, and I have to do it anyways to make the final "assertEquals()" work, so it's a good step. I implemented the next series of tests one at a time, but I'll show them all together:

 
public class BoardTest extends TestCase {
    Board board;

    protected void setUp() {
        board = new Board(new int[] { 1, 2, 0, 1 });
    }

    public void testBoardNotEqualToNull() {
        assertFalse(board.equals(null));
    }

    public void testBoardNotEqualToNonBoardObject() {
        assertFalse(board.equals(new Object()));
    }

    public void testBoardsOfDifferentSizesArentEqual() {
        Board board2 = new Board(new int[] { 1, 2, 3 });
        assertFalse(board.equals(board2));
        assertFalse(board2.equals(board));
    }

    public void testBoardsWithDifferentContentsArentEqual() {
        Board board2 = new Board(new int[] { 1, 2, 1, 1 });
        assertFalse(board.equals(board2));
        assertFalse(board2.equals(board));
    }

    public void testBoardsWithSameContentsAreEqual() {
        Board board2 = new Board(new int[] { 1, 2, 0, 1 });
        assertEquals(board, board2);
        assertEquals(board2, board);
    }

    public void testEqualityIsTransitive() {
        Board board2 = new Board(new int[] { 1, 2, 0, 1 });
        Board board3 = new Board(new int[] { 1, 2, 0, 1 });

        assertEquals(board, board2);
        assertEquals(board2, board3);
        assertEquals(board, board3);
    }
}

The implementation is what you'd expect:

 
public class Board {
    int[] cells;

    
    public Board(int[] cells) {
        this.cells = cells;
    }

    public boolean equals(Object obj) {
        if (obj == null) return false;
        if (obj.getClass() != Board.class) return false;
        
        Board that = (Board) obj;
        if (this.cells.length != that.cells.length) return false;
        
        for (int i = 0; i < cells.length; i++) 
            if (this.cells[i] != that.cells[i])
                return false;
            
        return true;
    }}

A Small, Full Grid

Rather than tackle that large test I started with, I have a much simpler case. Let's look at a board with everything filled in, under no constraints.

 
public void testSolvedBoardComesBackAsIsWithNoConstraints() {
    Board board = new Board(new int[] { 1, 2, 2, 1 });
    Board result = new Solver().solve(board, new ArrayList());
    assertEquals(board, result);
}

A straightforward solution makes that work:

 
public Board solve(Board board, ArrayList constraints) {
    if (solved(board, constraints))
        return board;
    throw new Error("No solution found");
}

    
private boolean solved(Board board, ArrayList constraints) {
    return true;
}

This next test is the same, but with constraints added. Since we always return the board we were given, we expect it to pass (and it does). A Constraint is a set of "n" positions; the constraint it enforces is that each number 1..n must appear exactly once in the cells with those positions.

 
public void testSolvedBoardComesBackAsIsWithConstraints() {
    Board board = new Board(new int[] { 1, 2, 2, 1 });
    ArrayList constraints = new ArrayList();
    constraints.add(new Constraint(new int[] { 0, 1 }));
    constraints.add(new Constraint(new int[] { 2, 3 }));
    constraints.add(new Constraint(new int[] { 0, 2 }));
    constraints.add(new Constraint(new int[] { 1, 3 }));

    Board result = new Solver().solve(board, constraints);

    assertEquals(board, result);
}

 
I'll do a bit of "Fake It" here. I want to say the board is solved if no constraint says it isn't. I'll run through the constraints, but for now they'll all answer "true" (i.e., that the puzzle is solved as far as that constraint is concerned). The tests stay green

 
// In Board.java:
private boolean solved(Board board, ArrayList constraints) {
    for (int i = 0; i < constraints.size(); i++) {
        Constraint constraint = (Constraint) constraints.get(i);
        if (!constraint.solvedBy(board)) return false;
    } 
    return true;
}
// In Constraint.java:
public boolean solvedBy(Board board) {
    return true;
}

To show that Constraint isn't picky enough, let's add a test:

 
public void testConstraintIsNotSolvedIfAnyIntegerIsMissing() {
    Board board = new Board(new int[] { 1, 2, 0, 1 });
    Constraint constraint = new Constraint(new int[] { 2, 3 });
    assertFalse(constraint.solvedBy(board));
}

Constraint makes a list of values used and returns false if any (digits 1 or greater) is not exactly 1.

 
// In Constraint.java:
public boolean solvedBy(Board board) {
    int[] valuesUsed = new int[board.width()+1];
    for (int i = 0; i < indexes.length; i++) 
        valuesUsed[board.at(indexes[i])]++;
        
    for (int i = 1; i < valuesUsed.length; i++) 
        if (valuesUsed[i] != 1)
            return false;
            
    return true;
}
// In Board.java:
public class Board {
    int[] cells;
    int width;

    public Board(int[] cells) {
        this.cells = cells;
        width = (int) Math.sqrt(cells.length);
    }

    public int width() {
        return width;
    }
    public int at(int i) {
        return cells[i];
    }

    // etc.
}

Validity

We have the notion of whether a puzzle is solved (all constraints say they are solved). But there's another notion we need along the way: whether a board configuration is valid. A board is valid if it has no constraint that says it's not. And the constraints will allow 0s (empty cells), but not two cells that claim to own the same number. So: empty cells in a row is valid, but two 7s in a row are not.

 
public void testBoardIsValidIfNoConstraintsAreContradicted() {
    Board board = new Board(new int[] { 1, 2, 0, 1 });
    ArrayList constraints = new ArrayList();
    constraints.add(new Constraint(new int[] { 2, 3 }));
    
    assertTrue(new Solver().valid(board, constraints));
}

We can make valid() return true and get to green. It needs to do more when we test the other way:

 
public void testBoardIsInvalidIfAnyConstraintIsContradicted() {
    Board board = new Board(new int[] { 1, 2, 1, 1 });
    ArrayList constraints = new ArrayList();
    constraints.add(new Constraint(new int[] { 2, 3 }));

    assertFalse(new Solver().valid(board, constraints));
}

 

 
// In Solver.java: 
public boolean valid(Board board, ArrayList constraints) {
    for (int i = 0; i < constraints.size(); i++) {
        Constraint constraint = (Constraint) constraints.get(i);
        if (!constraint.accepts(board)) return false;
    }
    return true;
}
// In Constraint.java:

public boolean accepts(Board board) {
    int[] valuesUsed = new int[board.width()+1];
    
    for (int i = 0; i < indexes.length; i++) 
        valuesUsed[board.at(indexes[i])]++;
    
    for (int i = 1; i < valuesUsed.length; i++) 
        if (valuesUsed[i] > 1)
            return false;
        
    return true;
}

If you look back at the earlier Constraint.solvedBy() method, you'll see that the first half of these two methods is identical. We'll extract a helper method:

 
public boolean solvedBy(Board board) {
    int[] valuesUsed = valuesUsed(board);
    
    for (int i = 1; i < valuesUsed.length; i++) 
        if (valuesUsed[i] != 1)
            return false;
        
    return true;
}

public boolean accepts(Board board) {
    int[] valuesUsed = valuesUsed(board);
    
    for (int i = 1; i < valuesUsed.length; i++) 
        if (valuesUsed[i] > 1)
            return false;
        
    return true;
}


private int[] valuesUsed(Board board) {
    int[] valuesUsed = new int[board.width()+1];
    for (int i = 0; i < indexes.length; i++) 
        valuesUsed[board.at(indexes[i])]++;
    return valuesUsed;
}

There's still duplication – those loops are very similar – but we'll leave it. In Ruby or Smalltalk, it might be easier to get rid of.

Depth-First Search

We're up to the edge of solving real Sudoku. When I do them by hand, I have a handful of strategies (look for forced cells, consider both down and across, etc.) But at least for my skill level, the hard problems have always required me to do a search: try it one way and see what happens, and backtrack if that fails.

My worst-case strategy is a standard strategy from AI: depth-first search. One algorithm for that says:

  • Make a stack and push the initial candidates.
  • While there are still things in the stack:
    • Pop the top item.
    • If it's a solution, return it.
    • If it's not a solution but it's valid, generate any consequent candidates and push them on the stack.
    • If it's neither a solution nor valid, don't push anything
  • If you have an empty stack at the end, you found no solution.

To represent my candidates, I'll have a Candidate class. It's just a data bag with three things we need to track: the board, the next index to change, and the value to put there. It's one of those rare classes so simple I won't write a test for it.

 
public class Candidate {
    private Board board;
    private int index;
    private int value;
    
    public Candidate(Board board, int index, int value) {
        this.board = board;
        this.index = index;
        this.value = value;
    }

    public Board board() {
        return board;
    }

    public int index() {
        return index;
    }
    
    public int value() {
        return value;
    }
}

In a search like this, we'll frequently take a board, and want to modify one cell. If we work on a copy of the board, backtracking is easy – just don't use that copy any more. 

 
public void testBoardCanBeBuiltWithOneValueChanged() {
    Board board2 = new Board(board, 0, 2);
    assertEquals(2, board2.at(0));
    assertEquals(new Board(new int[] { 2, 2, 0, 1 }), board2);
    assertFalse(board.equals(board2));
}

Create that new constructor, copying the cells and the width. (I forgot to copy the width on my first try:(

 
public Board(Board old, int index, int value) {
    this.cells = new int[old.cells.length];
    for (int i = 0; i < cells.length; i++)
        cells[i] = old.cells[i];
    width = old.width;

    cells[index] = value;
}

The next test can force the depth-first algorithm.

 
public void testBoardThatHasToApplyConstraints() {
    Board board = new Board(new int[] { 1, 2, 0, 1 });
    ArrayList constraints = new ArrayList();
    constraints.add(new Constraint(new int[] { 0, 1 }));
    constraints.add(new Constraint(new int[] { 2, 3 }));
    constraints.add(new Constraint(new int[] { 0, 2 }));
    constraints.add(new Constraint(new int[] { 1, 3 }));

    Board result = new Solver().solve(board, constraints);

    Board expected = new Board(new int[] { 1, 2, 2, 1 });
    assertEquals(expected, result);
}

As I go through this exposition, this just seems like too big a step. Let's try a smaller step that checks the generation of candidates.
 

 
public void testIfIndexIsTooBigThenNothingGetsPushed() {
    Board board = new Board(new int[] {1, 2, 0, 1});
    Stack stack = new Stack();
    new Solver().pushCandidates(stack, board, 4);
    assertTrue(stack.isEmpty());
}

public void testIfCellAlreadyFilledOnlyThatBoardGetsPushed() {
    Board board = new Board(new int[] {1, 2, 0, 1});
    Stack stack = new Stack();
    new Solver().pushCandidates(stack, board, 0);
    assertEquals(1, stack.size());
    assertEquals(new Candidate(board, 0, 1), stack.peek());
}

public void testIfCellIsEmptyThenPushAllCandidates() {
    Board board = new Board(new int[] {1, 2, 0, 1});
    Stack stack = new Stack();
    new Solver().pushCandidates(stack, board, 2);
    assertEquals(2, stack.size());
    assertEquals(new Candidate(board, 2, 1), stack.elementAt(0));
    assertEquals(new Candidate(board, 2, 2), stack.elementAt(1));
}

This code makes those tests pass:

 
// In Solver.java:
public void pushCandidates(Stack stack, Board board, int index) {
    if (index >= board.width() * board.width()) return;

    if (board.has(index)) {
        stack.push(new Candidate(board, index, board.at(index)));
        return;
    }

    for (int i = 1; i <= board.width(); i++)
        stack.push(new Candidate(board, index, i));
}
// In Candidate.java:
public boolean equals(Object obj) {
    if (obj == null || obj.getClass() != Candidate.class) return false;
        
    Candidate that = (Candidate) obj;
    return this.board.equals(that.board) 
        && this.index == that.index 
        && this.value == that.value;
}

// In Board.java:
public boolean has(int index) {
    return cells[index] != 0;
}

Let's return to testBoardThatHasToApplyConstraints(). Now I'll take a go at the search algorithm. Confession – I accidentally passed board (rather than newBoard) to pushCandidates(), and used the debugger to see I had the wrong board.

 
public Board solve(Board board, ArrayList constraints) {
    Stack stack = new Stack();
    pushCandidates(stack, board, 0);

    while (!stack.isEmpty()) {
        Candidate candidate = (Candidate) stack.pop();
        Board newBoard = new Board(candidate.board(), candidate.index(), candidate.value());

        if (solved(newBoard, constraints)) 
            return newBoard;
        
        if (valid(newBoard, constraints))
            pushCandidates(stack, newBoard, candidate.index() + 1);
    }

    throw new Error("No solution found");
}

The  next two tests pass already. One's a small board, the other is the example from the Wikipedia article cited above. (I also tried a couple "hard" ones from the back of a Sudoku book, but I won't include them here.)

 
public void test4x4Board() {
    Board board = new Board(new int[] { 
            1,2, 0,0,
            0,0, 1,2,
            0,1, 0,4,
            4,0, 0,0
    });
    ArrayList constraints = new ArrayList();
    
    // Row constraints
    constraints.add(new Constraint(new int[] {  0,  1,  2,  3 }));
    constraints.add(new Constraint(new int[] {  4,  5,  6,  7 }));
    constraints.add(new Constraint(new int[] {  8,  9, 10, 11 }));
    constraints.add(new Constraint(new int[] { 12, 13, 14, 15 }));

    // Column constraints
    constraints.add(new Constraint(new int[] {  0,  4,  8, 12 }));
    constraints.add(new Constraint(new int[] {  1,  5,  9, 13 }));
    constraints.add(new Constraint(new int[] {  2,  6, 10, 14 }));  
    constraints.add(new Constraint(new int[] {  3,  7, 11, 15 }));
    
    // Box constraints
    constraints.add(new Constraint(new int[] {  0,  1,  4,  5 })); 
    constraints.add(new Constraint(new int[] {  2,  3,  6,  7 }));  
    constraints.add(new Constraint(new int[] {  8,  9, 12, 13 }));  
    constraints.add(new Constraint(new int[] { 10, 11, 14, 15 })); 
   
    Board result = new Solver().solve(board, constraints);

    Board expected = new Board(new int[] { 
            1,2, 4,3,
            3,4, 1,2,
            2,1, 3,4,
            4,3, 2,1
    });
       
    assertEquals(expected, result); 
}
    
public void testBigBoardWiki() {
    Board board = new Board(new int[] {
            5,3,0, 0,7,0, 0,0,0,
            6,0,0, 1,9,5, 0,0,0,
            0,9,8, 0,0,0, 0,6,0,
            
            8,0,0, 0,6,0, 0,0,3,
            4,0,0, 8,0,3, 0,0,1,
            7,0,0, 0,2,0, 0,0,6,
                
            0,6,0, 0,0,0, 2,8,0,
            0,0,0, 4,1,9, 0,0,5,
            0,0,0, 0,8,0, 0,7,9
    });

    ArrayList constraints = constraints9x9();

    Board result = new Solver().solve(board, constraints);

    Board expected = new Board(new int[] {
           5,3,4, 6,7,8, 9,1,2,
           6,7,2, 1,9,5, 3,4,8,
           1,9,8, 3,4,2, 5,6,7,

           8,5,9, 7,6,1, 4,2,3,
           4,2,6, 8,5,3, 7,9,1,
           7,1,3, 9,2,4, 8,5,6,

           9,6,1, 5,3,7, 2,8,4,
           2,8,7, 4,1,9, 6,3,5,
           3,4,5, 2,8,6, 1,7,9
    });
        
    assertEquals(expected, result);      
}
    
private ArrayList constraints9x9() {
    ArrayList constraints = new ArrayList();

    // Row constraints
    constraints.add(new Constraint(new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 }));
    constraints.add(new Constraint(new int[] { 9, 10, 11, 12, 13, 14, 15, 16, 17 }));
    constraints.add(new Constraint(new int[] { 18, 19, 20, 21, 22, 23, 24, 25, 26 }));
    constraints.add(new Constraint(new int[] { 27, 28, 29, 30, 31, 32, 33, 34, 35 }));
    constraints.add(new Constraint(new int[] { 36, 37, 38, 39, 40, 41, 42, 43, 44 }));
    constraints.add(new Constraint(new int[] { 45, 46, 47, 48, 49, 50, 51, 52, 53 }));
    constraints.add(new Constraint(new int[] { 54, 55, 56, 57, 58, 59, 60, 61, 62 }));
    constraints.add(new Constraint(new int[] { 63, 64, 65, 66, 67, 68, 69, 70, 71 }));
    constraints.add(new Constraint(new int[] { 72, 73, 74, 75, 76, 77, 78, 79, 80 }));

    // Column constraints
    constraints.add(new Constraint(new int[] { 0, 9, 18, 27, 36, 45, 54, 63, 72 }));
    constraints.add(new Constraint(new int[] { 1, 10, 19, 28, 37, 46, 55, 64, 73 }));
    constraints.add(new Constraint(new int[] { 2, 11, 20, 29, 38, 47, 56, 65, 74 }));
    constraints.add(new Constraint(new int[] { 3, 12, 21, 30, 39, 48, 57, 66, 75 }));
    constraints.add(new Constraint(new int[] { 4, 13, 22, 31, 40, 49, 58, 67, 76 }));
    constraints.add(new Constraint(new int[] { 5, 14, 23, 32, 41, 50, 59, 68, 77 }));
    constraints.add(new Constraint(new int[] { 6, 15, 24, 33, 42, 51, 60, 69, 78 }));
    constraints.add(new Constraint(new int[] { 7, 16, 25, 34, 43, 52, 61, 70, 79 }));
    constraints.add(new Constraint(new int[] { 8, 17, 26, 35, 44, 53, 62, 71, 80 }));

    // Box constraints
    constraints.add(new Constraint(new int[] { 0, 1, 2, 9, 10, 11, 18, 19, 20 }));
    constraints.add(new Constraint(new int[] { 3, 4, 5, 12, 13, 14, 21, 22, 23 }));
    constraints.add(new Constraint(new int[] { 6, 7, 8, 15, 16, 17, 24, 25, 26 }));
    constraints.add(new Constraint(new int[] { 27, 28, 29, 36, 37, 38, 45, 46, 47 }));
    constraints.add(new Constraint(new int[] { 30, 31, 32, 39, 40, 41, 48, 49, 50 }));
    constraints.add(new Constraint(new int[] { 33, 34, 35, 42, 43, 44, 51, 52, 53 }));
    constraints.add(new Constraint(new int[] { 54, 55, 56, 63, 64, 65, 72, 73, 74 }));
    constraints.add(new Constraint(new int[] { 57, 58, 59, 66, 67, 68, 75, 76, 77 }));
    constraints.add(new Constraint(new int[] { 60, 61, 62, 69, 70, 71, 78, 79, 80 }));
    
    return constraints;
}

Analysis

This snapped together fairly well. It's taken me longer to write it up than it did to solve.

The oddest part for me was my first test. I wanted a test that was something like the real problem, and this was a good test for that. But it was a larger first step than I usually use. I suspect this is because I had a solution "trail" in mind.

There's still room for some refactoring, though Java doesn't make it particularly easy to get rid of similar loops. You could pull out a Constraints class (rather than the ArrayList); Solver.solved() and Solver.valid() could move over there. Or maybe Solver really should just be Constraints.

There's some room for work in the Board class. I tried it with clone() one time and with a constructor the other. A third alternative would be to put a copy-and-modify operation on Board (having the board generate a copy of itself with the cells modified). I'd keep that up my sleeve if I needed to later.

This had some of the flavor of re-implementing a known algorithm: I certainly had a certain approach in mind when I started. The steps didn't feel so big when I did them the first time, but I felt a strong urge to fill in the blanks while writing it up. (I wonder which impulse is better.)

The algorithm itself turned out to be as nice as I expected. The setting up of constraints is somewhat ugly, but it was quick and easy. I made my list and checked it twice. If I were generalizing beyond this case, I'd definitely revisit constraint setup.

The biggest surprise for me was in performance. I had fully expected this program to need a lot of performance tuning. After all, I didn't do anything to cut down the huge search space. But even the hardest boards I've tried run in about a second.

Resources

[Written July 10, 2006.]