Data Flow Analysis for Refactoring

As you refactor, you sometimes need to understand how data flows through the code. We’ll look at several techniques that reveal that information.

Data Flow Analysis

Data flow analysis tracks the creation and use of data through the code in an app. It can be local (within a method) or global (across the whole application).

Compilers often use this information to optimize code. (See the References.) For example, if a variable is assigned a value by an expression with no side effects, and that value is never read, the assignment is dead code and can be removed.
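As a small illustration, here is a sketch in Java of a dead definition. The class, method, and variable names are invented for this example. The local named discount is computed by an expression with no side effects and is never read, so removing it does not change what the method returns.

```java
// Hypothetical example: all names are invented for illustration.
class DeadCodeExample {
    static int totalWithDeadStore(int price, int quantity) {
        int discount = price / 10;      // defined, no side effects, never read: dead code
        int total = price * quantity;
        return total;
    }

    // The same method after the dead definition is removed.
    static int total(int price, int quantity) {
        return price * quantity;
    }
}
```

Both methods return the same value for the same arguments, which is what makes the removal safe.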

We use data flow analysis in refactoring too: to decide whether variables are being reused, to reorder statements, to decide what to extract (e.g., is a variable local or a parameter?), etc.

Sometimes, the IDE’s refactorings take care of this for us; other times, we must analyze data flow manually.

Find Definition / Find Callers

IDEs often let you jump from any use of an item to its definition or, from the definition, find all of its uses. (For example, Command-click on a name may take you to its definition.)

The nicest part of this is that the result is semantic: it finds uses of the very same item, not just things that happen to have the same name.

Search and Highlighting

Nearly every IDE can search for text, and you can use this to find definitions and uses. Some IDEs automatically highlight any text that matches the current selection, perhaps also putting marks in the gutter. This gives you a no-work way to find things.

However:

  1. It only shows matches in the current file (though you can broaden the search).
  2. It works textually – giving you false positives when a name is independently reused.
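To make the second limitation concrete, here is a sketch in Java where one name refers to two unrelated items. The class and its members are hypothetical. A text search for count matches every line marked below, while a semantic search starting from the field would be expected to report only the use in increment.

```java
// Hypothetical example: all names are invented for illustration.
class NameReuseExample {
    static int count = 0;                // a field named count

    static void increment() {
        count++;                         // uses the field
    }

    static int countVowels(String text) {
        int count = 0;                   // a different item: a local that shadows the field
        for (char c : text.toCharArray()) {
            if ("aeiou".indexOf(c) >= 0) {
                count++;                 // uses the local, not the field
            }
        }
        return count;                    // still the local
    }
}
```

Calling countVowels leaves the field untouched, even though every statement in it mentions count.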

Manually Hide the Definition

You can temporarily comment out a declaration, or make it more private (for example, change a public member to private). When you recompile, the compiler will flag each usage that is affected.
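Here is a sketch of the experiment in Java. The Invoice and Report classes are hypothetical, and the code is shown in its compiling state, with comments marking what would be expected to break.

```java
// Hypothetical example: all names are invented for illustration.
class Invoice {
    // The experiment: change this to "private int taxRate", or comment it out,
    // then recompile and read the errors.
    int taxRate = 8;                     // package-private as written

    int tax(int amount) {
        // Inside the class: still compiles if the field becomes private,
        // but fails if the declaration is commented out.
        return amount * taxRate / 100;
    }
}

class Report {
    static int taxFor(Invoice invoice, int amount) {
        // Outside the class: fails to compile in both versions of the
        // experiment, which is how the compiler reveals this usage.
        return amount * invoice.taxRate / 100;
    }
}
```

Making the field private finds only the outside usages; commenting it out finds all of them.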

Embed in Braces or Closures

As you move from information to action, one way to isolate variables is to use braces or closures. In many languages, a temporary variable is visible only in the scope it’s declared in and in any scopes nested inside it. Thus, if you add a left brace before the definition, and a right brace after the point where you think it’s done being used, you isolate that value. If the variable is still used after the right brace, the compiler will tell you.

To do this, you have to respect control flow; you can’t open a scope in the middle of an else clause, then close it a couple of statements after the surrounding “if”.
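The brace technique might look like this in Java, which allows a bare block inside a method. The method and its variables are invented for this sketch. Each pair of braces starts just before a definition and ends where we believe the variable is no longer needed.

```java
// Hypothetical example: all names are invented for illustration.
class BraceScopeExample {
    static String describe(int width, int height) {
        String result;
        {   // opened just before the definition of area
            int area = width * height;
            result = "area=" + area;
        }   // closed where we believe area is done being used
        // Mentioning area here would be a compile error, which would tell us
        // the guess about its lifetime was wrong.
        {
            int perimeter = 2 * (width + height);
            result = result + ", perimeter=" + perimeter;
        }
        return result;
    }
}
```

Note that result is declared outside both blocks because it carries data from one to the other; the braces make that flow visible.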

Closures give you a similar opportunity. They create an anonymous function, and have more sophisticated rules about arguments and “captured” variables. See the article by Jay Bazuzi in the references; it shows how to use a C++ closure to extract a method.
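The referenced article works in C++; the following is only a loose Java analogue, with invented names, and is not taken from that article. The candidate code is wrapped in a lambda and called immediately. The variables the lambda captures are the inputs an extracted method would need, and its return value is the output. Java only lets a lambda capture locals that are effectively final, so an attempt to assign to an outer local inside the lambda would show up as a compile error.

```java
import java.util.function.IntSupplier;

// Hypothetical example: all names are invented for illustration.
class ClosureExample {
    static int priceWithTax(int quantity, int unitPrice) {
        int taxPercent = 8;
        // Captured variables: quantity, unitPrice, taxPercent.
        IntSupplier compute = () -> {
            int subtotal = quantity * unitPrice;   // local to the lambda
            return subtotal + subtotal * taxPercent / 100;
        };
        return compute.getAsInt();                 // call it right away
    }
}
```

Once the captured variables and the result are clear, the lambda body can be moved into a named method that takes those variables as parameters.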

Extract Method

If your IDE can Extract Method, it probably lets you select a block of code, and shows you the parameters and return values before completing the refactoring. If the proposed signature isn’t what you expected, you can cancel, rearrange code, and try again.

In some cases, you can extract first, then use Introduce Parameter to move some code back out to the call.
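As a sketch of the extract-then-adjust sequence, assume a hypothetical receipt method in Java with a tax rate written inline. Extracting first pulls the literal into the new method; Introduce Parameter then moves it back out to the call site. The code shows the starting point and the end result, and all names are invented.

```java
// Hypothetical example: all names are invented for illustration.
class ExtractExample {
    // Before: computation and formatting in one method.
    static String receiptBefore(int quantity, int unitPrice) {
        int subtotal = quantity * unitPrice;
        int total = subtotal + subtotal * 8 / 100;
        return "Total: " + total;
    }

    // After Extract Method, then Introduce Parameter on the literal 8:
    // the rate is now supplied by the caller.
    static String receipt(int quantity, int unitPrice) {
        return "Total: " + totalWithTax(quantity, unitPrice, 8);
    }

    static int totalWithTax(int quantity, int unitPrice, int taxPercent) {
        int subtotal = quantity * unitPrice;
        return subtotal + subtotal * taxPercent / 100;
    }
}
```

The parameter list of totalWithTax is the data flowing in, and the return value is the data flowing out.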

Example: Split Temporary Variable

This refactoring helps when you have a temporary variable that’s being reused (especially in a long method). You identify the independent usages and give each one its own variable, so that each variable’s scope can be reduced.

Reducing the scope of variables makes it easier to extract methods from a long method.

The code looks something like this:

...
var temp = ...   // the only declaration
:
use temp
: 
temp = ...  // change its value
:
use temp
: 
etc.

This isn’t too hard if it’s straight-line code, but usually you have to worry about conditionals and loops. You want to be extra careful if temp is computed in terms of its previous value.
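The last caution deserves an example. In this hypothetical Java method, every assignment to temp reads the value left by the previous one, so the assignments are not independent usages and the variable is not a candidate for splitting.

```java
// Hypothetical example: all names are invented for illustration.
class AccumulatorExample {
    static int sumOfSquares(int[] values) {
        int temp = 0;
        for (int v : values) {
            temp = temp + v * v;   // new value depends on the previous value
        }
        return temp;               // one running total, not several unrelated uses
    }
}
```

A better fix here is usually a more descriptive name for the accumulator, not a second variable.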

To split the variable, you might temporarily insert braces to help make sure you understand the relevant scope. Then you can follow the normal path of change, working top to bottom.
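Here is a minimal before-and-after sketch in Java for the straight-line case. The method names and the rectangle example are invented. In the first version temp holds two unrelated values; in the second, each value has its own name and can be declared final.

```java
// Hypothetical example: all names are invented for illustration.
class SplitTempExample {
    // Before: temp is reused for two unrelated values.
    static String before(int width, int height) {
        int temp = 2 * (width + height);
        String text = "perimeter=" + temp;
        temp = width * height;               // second, independent usage
        return text + ", area=" + temp;
    }

    // After: one variable per usage, each assigned exactly once.
    static String after(int width, int height) {
        final int perimeter = 2 * (width + height);
        String text = "perimeter=" + perimeter;
        final int area = width * height;
        return text + ", area=" + area;
    }
}
```

With the variables split, the perimeter and area computations no longer share state, so either one can be extracted into its own method.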

Conclusion

Many refactorings require some level of data-flow analysis to make sure they’re safe or to do the job. In many cases, a refactoring IDE can help; in other cases, you do the work manually. When you’re working manually, lean on the IDE and compiler as much as you can.

References

“Data-flow analysis.” Wikipedia. Retrieved 2021-06-13.

“Extract Function,” by Jay Bazuzi. Retrieved 2021-06-13.