XPlorations
From 0 to Composite (and Back Again)
February, 2000
In refactoring, you change code in small steps that preserve its semantics
while improving other properties.
This paper looks at how you might evolve software to use the
Composite pattern, starting from nothing. (See Design Patterns by
Gamma et al. for a description of the Composite pattern.)
0
In the simplest case, an object has no knowledge of another object. No
references? No problem. The syntax for this situation is both simple and
well-known.
1
There comes a day when an object learns about another object. In Java,
this is done via a reference:
    public class ClassA {
        ClassB member;
    }
Even a simple reference involves a number of decisions:
- Protection. The general rule is to use the least public protection
required (from private, package, protected, and public).
- Static or member. If there's a single reference shared by all instances
of the class, it should be declared static. If each instance
of the class needs its own reference, it's not static.
- Initialization. You can initialize a value in the declaration, in
the constructor, or via a member function. You want to maintain the validity
of the reference. By setting it in the declaration or constructor, you
can ensure that it moves from valid reference to valid reference (e.g.,
no chance of NullPointerException). You may pay a price in time
or memory, though. If you don't set the value until some member function
is called, you risk that a sequence of calls (in an unexpected order) can
retrieve a null value.
- Update. Some references are intended to be set once and treated
as read-only after that. (The object referred to may change, but the reference
itself will not.) These are usually best dealt with using declaration or
constructor initialization. Other references are meant to change, and you'll
provide some methods that have that effect. Note that it is harder to think
about values that change than those that don't.
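The initialization and update choices above might look like this in code. This is only a sketch: the `name` field on `ClassB` and the classes `EagerA` and `LazyA` are hypothetical names invented for illustration.

```java
// Hypothetical collaborator, standing in for ClassB from the example above.
class ClassB {
    final String name;

    ClassB(String name) { this.name = name; }
}

// Set once in the constructor: private (the least public protection) and
// final (the reference never changes, though the ClassB it refers to could).
class EagerA {
    private final ClassB member;

    EagerA(ClassB member) {
        if (member == null) {
            throw new IllegalArgumentException("member must not be null");
        }
        this.member = member;
    }

    ClassB getMember() { return member; }
}

// Set later through a member function: nothing to supply at construction
// time, but a caller that reads before writing gets null.
class LazyA {
    private ClassB member;   // null until setMember() is called

    void setMember(ClassB member) { this.member = member; }

    ClassB getMember() { return member; }
}
```

`EagerA` can never hand out a null reference; `LazyA` can, if `getMember()` is called before `setMember()`.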
2 or a Few
If you have a reference to a different class, you normally create a separate
member variable. If the second reference is of the same type as the first,
you have to decide whether or not to make it separate.
A second reference may presage a third, fourth, etc. - see the next
section ("Many").
If you can only have 2 or 3 of something, just declare the new one(s) as a new field.
For example, if you're dealing with a transfer between two accounts, there's one account
being credited and one being debited. We see no reason this will evolve
to n-way transfers, so two fields will suffice. You might make the same
decision for a 2-d coordinate (x and y).
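As a sketch of the two-field approach, the transfer example might be written as follows. The `Account` and `Transfer` classes, their method names, and the use of cents as the unit are assumptions made for illustration.

```java
// Hypothetical minimal account, just enough to show the two references.
class Account {
    private long balanceInCents;

    Account(long balanceInCents) { this.balanceInCents = balanceInCents; }

    void credit(long cents) { balanceInCents += cents; }

    void debit(long cents) { balanceInCents -= cents; }

    long getBalanceInCents() { return balanceInCents; }
}

// Exactly two accounts take part, so two named fields are used
// instead of a collection.
class Transfer {
    private final Account debited;
    private final Account credited;
    private final long cents;

    Transfer(Account debited, Account credited, long cents) {
        this.debited = debited;
        this.credited = credited;
        this.cents = cents;
    }

    // Moves the amount from the debited account to the credited one.
    void execute() {
        debited.debit(cents);
        credited.credit(cents);
    }
}
```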
Many
Once you grow to two or more of something, you should consider allowing
for an arbitrary number of them. Java has three alternatives:
- Array. You can create an array of the type being referenced:
    ObjectX[] array = new ObjectX[count];
(Recall that in Java the array and its contents are initialized separately,
so this statement creates an array of null references.) The nice part about
using an array is that it contains objects of a known type. The downside
is the requirement that you know the size of the array to allocate it,
and it's difficult to change its size.
- Collection. "Collection" is meant to cover all Java's generic collection
objects: Vector, List, etc. In Java, containers are based on storing and
fetching an Object. The user of the collection must know to which type
to cast the result. (If Java had some sort of template or generic type
facility, this cast would be unnecessary.) The benefit of using collections
is that they provide flexible, variable-size containers, with a variety
of performance characteristics. The downside is their requirement for casts.
- Specialized collection object. Some collections need to maintain
specialized information. In these cases, you might introduce a special
type for the collection. For example, suppose you're creating a checkbook
program that needs to track a number of transactions. You could maintain
an array or Vector of these, but a set of transactions really represents
a whole account. Since Accounts have other significant information attached
to them (account number, total, etc.), we might have a new class Account
with a list of Transactions.
Account <>------* Transaction
Now, other objects can maintain a reference to Account instead of to a
bunch of Transactions. Account can use whatever collection method is most
appropriate. (If collections are used, Account can hide the necessary casts
- an example of Fowler's refactoring "Encapsulate Downcast".)
The good side of a specialized object is that it creates a meaningful
new object where we can attach behavior. It's more work than creating a
generic collection, though. (See also M. Fowler's refactoring "Encapsulate
Collection".)
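A specialized collection object along these lines might look like the sketch below. It uses the generic collections that Java gained after this article was written, so the cast mentioned above does not appear. The field and method names (`accountNumber`, `getTotalInCents`, and so on) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical transaction: just an amount, negative for withdrawals.
class Transaction {
    private final long amountInCents;

    Transaction(long amountInCents) { this.amountInCents = amountInCents; }

    long getAmountInCents() { return amountInCents; }
}

// The Account owns the collection and is the place to attach behavior.
class Account {
    private final String accountNumber;
    private final List<Transaction> transactions = new ArrayList<>();

    Account(String accountNumber) { this.accountNumber = accountNumber; }

    String getAccountNumber() { return accountNumber; }

    void add(Transaction transaction) { transactions.add(transaction); }

    // Read-only view, so clients cannot modify the list behind Account's back.
    List<Transaction> getTransactions() {
        return Collections.unmodifiableList(transactions);
    }

    // Behavior that belongs to the collection as a whole.
    long getTotalInCents() {
        long total = 0;
        for (Transaction t : transactions) {
            total += t.getAmountInCents();
        }
        return total;
    }
}
```

Clients hold a reference to `Account` and never see which collection class sits behind it, so that choice can change later without touching them.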
Composite
A list of items may suffice for a long time, but you may eventually need
a more complex structure.
Suppose you have a set of products, and a customer chooses at most one
of each; the choices are kept in a "current order" list. One day, the marketing
department gets the idea of having "bundles", where a bundle is a list
of existing products. You can try to maintain the existing list structure,
adding the products in the bundle into the "current order" list. The problem
comes when a user wants to remove a bundle: you find you've lost track
of what was in the bundle.
The crucial step to solving this is to say, "What if a bundle were another
product?" Then we have products containing products. This sort of recursion
screams out "Composite".
You might think you don't need the full complexity of composite - it
allows bundles that contain bundles, and who would need that? That may
or may not be true. It's certainly possible to conceive of a system that
allows only products or bundles containing products (but not bundles containing
bundles).
This situation is much like the 0-1-infinity rule that discourages using
2 or 3 explicit members: if you have two levels of products, odds
are good that you'll have more. (You'll probably find it less complex to
allow the recursion.)
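The idea of treating a bundle as just another product can be sketched as follows. The `Product` interface, its `name()` method, and the `Item` and `Bundle` classes are hypothetical; a real system would use whatever operations its products share.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common type: anything that can appear in an order.
interface Product {
    String name();
}

// A single product.
class Item implements Product {
    private final String name;

    Item(String name) { this.name = name; }

    public String name() { return name; }
}

// A bundle is itself a product, and it contains products
// (which may in turn be bundles).
class Bundle implements Product {
    private final String name;
    private final List<Product> contents = new ArrayList<>();

    Bundle(String name) { this.name = name; }

    public String name() { return name; }

    void add(Product product) { contents.add(product); }

    int size() { return contents.size(); }
}
```

With this shape, the current order can be a list of `Product`. A bundle sits in it as one entry, so removing the bundle is a single `remove` call and its contents are never mixed in with the other products.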
How to Move from Custom Collection to Composite
- Move your design to a custom collection type:
    List <>-----* Item
Make sure the List has methods for add(Item), remove(Item),
and getChild(int). (Test.)
- Create a new parent class "Parent", with a protected constructor,
that has the add(Item), remove(Item),
and getChild(int) methods. (Test.) [The names of your
classes should be appropriate for your domain; in the example, we might
use "Product", "Bundle", and "Item".]
- Make Parent the superclass of Item. (Test.)
- Modify List's and Parent's methods to take and store instances of Parent (instead of
Item). You may need to declare and stub out in Parent any methods of Item that are
now called through it. The compiler will tell you which ones they are. (Test.)
- Make Parent the superclass of List. (Test.)
- Move the signature of any (remaining) operation common to both List and Item up into
the parent. Provide a default implementation if possible. (Test.)
- For each method, ensure that its implementation is reasonable. You may be
able to make some of them abstract in the Parent. (Test.)
- Locate each reference to Item. See if it makes sense to operate on Parent
instead, and generalize the type if possible. (Test.)
- Locate each reference to List. See if it makes sense to operate on Parent
instead, and generalize the type if possible. (Test.)
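One possible end state of the steps above is sketched below. The list class is called `ItemList` here only to avoid clashing with `java.util.List`. The shared operation `count()` is hypothetical, and having the default child methods throw `UnsupportedOperationException` is an assumption of this sketch, not something the steps require.

```java
import java.util.ArrayList;

// The new parent class, with a protected constructor and the
// child-management methods.
abstract class Parent {
    protected Parent() { }

    // Default implementations for classes that hold no children.
    // Throwing here is an assumption of this sketch.
    void add(Parent child) {
        throw new UnsupportedOperationException("not a container");
    }

    void remove(Parent child) {
        throw new UnsupportedOperationException("not a container");
    }

    Parent getChild(int index) {
        throw new UnsupportedOperationException("not a container");
    }

    // Hypothetical operation common to items and lists, declared in the
    // parent and left abstract.
    abstract int count();
}

// A leaf: inherits the default child methods.
class Item extends Parent {
    @Override
    int count() { return 1; }
}

// The container: stores Parents, so it can hold items or other lists.
class ItemList extends Parent {
    private final ArrayList<Parent> children = new ArrayList<>();

    @Override
    void add(Parent child) { children.add(child); }

    @Override
    void remove(Parent child) { children.remove(child); }

    @Override
    Parent getChild(int index) { return children.get(index); }

    // Counts the leaf items beneath this list, at any depth.
    @Override
    int count() {
        int total = 0;
        for (Parent child : children) {
            total += child.count();
        }
        return total;
    }
}
```

Clients that only call `count()` can now hold a `Parent` without knowing whether they have a single item or a whole tree.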
And Back...
When do you move back from Composite to a collection?
- When your objects are no longer in a meaningful hierarchy. (You have items and lists
of items, but no lists of lists.)
- When there are no shared operations (left) between lists and items.
If you have the data hierarchy of a Composite, but no shared operations,
you're in a situation similar to a "data bag" or "struct class": you may
want to observe what clients do with the class - perhaps there's functionality
that belongs in the Composite rather than its client.
How to Move from Composite to Collection
- Modify the add(), remove(), and getChild()
methods of Parent and List to operate on Item rather than Parent. The compiler
will let you know if anybody is still trying to use "bundles of bundles". (Test.)
- Make sure List has all the methods of Parent. (Move an implementation down
if necessary.) (Test.)
- Make List's superclass be the superclass of Parent. (Test.)
- Make sure Item has any methods it needs from Parent. (It shouldn't need
the list methods.) (Test.)
- Make Item's superclass be the superclass of Parent. (Test.)
- Nobody should be referencing the Parent class any more. Remove it. (Test.)
- Review the operations shared by Item and List to see if any should be removed.
(Test.)
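After these steps, the design might be back to something like the sketch below: two unrelated classes, with the list storing `Item` directly. As before, `ItemList` stands in for List to avoid clashing with `java.util.List`, and the `name` field and `size()` method are hypothetical.

```java
import java.util.ArrayList;

// No longer part of a hierarchy; it has no list methods.
class Item {
    private final String name;

    Item(String name) { this.name = name; }

    String getName() { return name; }
}

// A plain custom collection. It accepts only Item, so a list of lists
// would no longer compile.
class ItemList {
    private final ArrayList<Item> items = new ArrayList<>();

    void add(Item item) { items.add(item); }

    void remove(Item item) { items.remove(item); }

    Item getChild(int index) { return items.get(index); }

    int size() { return items.size(); }
}
```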
...To 0
The remaining changes are straightforward:
- Change a constant-size collection to use an array.
- Move from a small array to explicit members.
- Delete members until there's only one left.
- Delete the last reference, and you're back to "0".
Summary
We've shown how a series of fairly small steps can move you through the path from
0 references to a Composite. This suggests that we are never "painted into a corner"
if we start simpler, even if our data will grow to the complexity of the
trees used in Composite.
We've also shown how to refactor the other way: from Composite down to nothing.
This assures us that we can simplify our application if the requirements
become simpler.
Resources
[Written 2-8-2000.]