From 0 to Composite (and Back Again)

In refactoring, you make small steps in changing code, to preserve semantics but improve other properties. This paper looks at how you might evolve software to use the Composite pattern, starting from nothing. (See Design Patterns by Gamma et al. for a description of the Composite pattern.)

0

In the simplest case, an object has no knowledge of another object. No references? No problem. The syntax for this situation is both simple and well-known.

1

There comes a day when an object learns about another object. In Java, this is done via a reference:

public class ClassA {
        ClassB member;
    }

Even a simple reference involves a number of decisions:

Protection. The general rule is to use the least public protection required (from private, package, protected, and public).
Static or member. If there’s a single reference shared by all members of that class, it should be declared static. If each instance of the class needs its own reference, it’s not static.
Initialization. You can initialize a value in the declaration, in the constructor, or via a member function. You want to maintain the validity of the reference. By setting it in the declaration or constructor, you can ensure that it moves from valid reference to valid reference (e.g., no chance of NullPointerException). You may pay a price in time or memory, though. If you don’t set the value until some member function is called, you risk that a sequence of calls (in an unexpected order) can retrieve a null value.
Update. Some references are intended to be set once and treated as read-only after that. (The object referred to may change, but the reference itself will not.) These are usually best dealt with using declaration or constructor initialization. Other references are meant to change, and you’ll provide some methods that have that effect. Note that it is harder to think about values that change than those that don’t.

2 or a Few

If you have a reference to a different class, you normally create a separate member variable. If the second reference is of the same type as the first, you have to decide whether or not to make it separate. A second reference may presage a third, fourth, etc. – see the next section (“Many“).

If you can only have 2 or 3 of something, just declare the new one(s) as a new field. For example, if you’re dealing with a transfer between two accounts, there’s one account being credited and one being debited. We see no reason this will evolve to n-way transfers, so two fields will suffice. You might make the same decision for a 2-d coordinate (x and y).

Many

Once you grow to two or more of something, you should consider allowing for an arbitrary number of them. Java has three alternatives:

Array. You can create an array of the type being referenced:
```
ObjectX[] array = new ObjectX [count];
```
(Recall that in Java the array and its contents are initialized separately, so this statement created an array of null references.) The nice part about using an array is that it contains objects of a known type. The downside is the requirement that you know the size of the array to allocate it, and it’s difficult to change its size.
Collection. “Collection” is meant to cover all Java’s generic collection objects: Vector, List, etc. In Java, containers are based on storing and fetching an Object. The user of the collection must know to which type to cast the result. (If Java had some sort of template or generic type facility, this cast would be unnecessary.) The benefit of using collections is that they provide flexible, variable-size containers, with a variety of performance characteristics. The downside is their requirement for casts.
Specialized collection object. Some collections need to maintain specialized information. In these cases, you might introduce a special type for the collection. For example, suppose you’re creating a checkbook program that needs to track a number of transactions. You could maintain an array or Vector of these, but a set of transactions really represent a whole account. Since Accounts have other significant information attached to them (account number, total, etc.), we might have a new class Account with a list of Transactions.
```
Account <>------* Transaction
```
Now, other objects can maintain a reference to Account instead of to a bunch of Transactions. Account can use whatever collection method is most appropriate. (If collections are used, Account can hide the necessary casts – an example of Fowler’s refactoring “Encapsulate Downcast”.)

The good side of a specialized object is that it creates a meaningful new object where we can attach behavior. It’s more work than creating a generic collection, though. (See also M. Fowler’s refactoring “Encapsulate Collection”.)

Composite

A list of items may suffice for a long time, but you may eventually need a more complex structure.

Suppose you have a set of products, and someone chooses at most one of each, maintained in a list of current orders. One day, the marketing department gets the idea of having “bundles”, where a bundle is a list of existing products. You can try to maintain the existing list structure, adding the products in the bundle into the “current order” list. The problem comes when a user wants to remove a bundle: you find you’ve lost track of what was in the bundle.

The crucial step to solving this is to say, “What if a bundle were another product?” Then we have products containing products. This sort of recursion screams out “Composite”.

You might think you don’t need the full complexity of composite – it allows bundles that contain bundles, and who would need that? That may or may not be true. It’s certainly possible to conceive of a system that allows only products or bundles containing products (but not bundles containing bundles).

This situation is much like the 0-1-infinity rule that discourages using 2 or 3 explicit members: if you have two levels of products, odds are good that you’ll have more. (You’ll probably find it less complex to allow the recursion).

How to Move from Custom Collection to Composite

Move your design to a custom collection type:
```
List <>-----* Item
```
Make sure the List has methods for add(Item), remove(Item), and getChild(int). (Test.)
Create a new parent class “Parent”, with a protected constructor, that has the add(Item), remove(Item), and getChild(int) methods. (Test.) [The names of your classes should be appropriate for your domain; in the example, we might use “Product”, “Bundle”, and “Item”.]
Make Parent the superclass of Item. (Test.)
Modify List’s and Parent’s methods to take and store instances of Parent (instead of Item). You may need to declare and stub out any methods from Item to Parent. The compiler will tell you which ones they are. (Test.)
Make Parent the superclass of List. (Test.)
Move the signature of any (remaining) operation common to both List and Item up into the parent. Provide a default implementation if possible. (Test.)
For each method, ensure that it’s implementation is reasonable. You may be able to make some of them abstract in the Parent. (Test.)
Locate each reference to Item. See if it can make sense to operate on Parent instead, and generalize the type if possible. (Test.)
Locate each reference to List. See if it makes sense to operate on Parent instead, and generalize the type if possible. (Test.)

And Back…

When do you move back from Composite to a collection?

When your objects are no longer in a meaningful hierarchy. (You have items and lists of items, but no lists of lists.)
When there are no shared operations (left) between lists and items.

If you have the data hierarchy of a Composite, but no shared operations, you’re in a situation similar to a “data bag” or “struct class”: you may want to observe what clients do with the class – perhaps there’s functionality that belongs in the Composite rather than its client.

How to Move from Composite to Collection

Modify the add(), remove(), and getChild() methods of Parent and List to operate on Item rather than Parent. The compiler will let you know if anybody is still trying to use “bundles of bundles”. (Test.)
Make sure List has all the methods of Parent. (Move an implementation down if necessary.) (Test.)
Make List’s superclass be the superclass of Parent. (Test.)
Make sure Item has any methods it needs from Parent. (It shouldn’t need the list methods.) (Test.)
Make Item’s superclass be the superclass of Parent. (Test.)
Nobody should be referencing the Parent class any more. Remove it. (Test.)
Review the operations shared by Item and List to see if any should be removed. (Test.)

…To 0

The remaining changes are straightforward:

Change a constant-size collection to use an array.
Move from a small array to explicit members.
Delete members until there’s only one left.
Delete the last reference, and you’re back to “0”.

Summary

We’ve shown how a series of fairly small steps can move you through the path from 0 references to a Composite. This suggests that we are never “painted into a corner” if we start simpler, even if our data will grow to the complexity of the trees used in Composite.

We’ve also shown how to refactor the other way: from Composite down to nothing. This assures us that we can simplify our application if the requirements become simpler.

Resources

Design Patterns: Elements of Reusable Object-Oriented Software, Gamma et al., 1995.
Refactoring: Improving the Design of Existing Code, Martin Fowler, 1999.
Command, Interpreter, Visitor

[Written 2-8-2000.]