The Compiler as a Refactoring Aid, by Tahir Hashmi

January 23, 2013

Recently, I sat down to refactor a Go application with a high-level design objective in place. The application had two conceptually separate entities implemented in different files but mashed into a single package. I needed to separate them out into their own packages. I wasn’t using an IDE — just Emacs with basic formatting and non-contextual auto-complete aids.

I started out by creating a new directory for the package to be split out and moved the files that contained most of the relevant code into that directory, without thinking about the consequences. I could just invoke the compiler and let it guide me through the process of fitting the pieces of the puzzle together. One of the nice features of modern compilers is that they stop reporting errors beyond a limit. This lets me fix a program in small steps, guided by how the compiler’s error output changes after each fix.

The first thing the compiler told me about was all the variables that had become invisible because they now lived in a different package. While I was working through those errors, it also helped me discover that the interface used by Entity A to access Entity B (the one that moved to the new package) had only private (in Go terms, unexported) methods. Whoa! This is a semantic issue, one that automatic refactoring tools for moving code around or creating new classes can’t deal with.

Next, I tried to access the invisible variables by importing the new package, but the compiler complained, “import cycle not allowed”. Nice. I could work out the dependency tree and information passing after having separated the packages, instead of first figuring out the dependencies and then moving code. See how the compiler is guiding me toward better design as well?

At this point, some of you might think this is a daft way of going about refactoring and that I should have worked out the entire design on paper before touching the code. But is it daft, really? Here’s what refactoring guided by a “good” compiler allows me to do:

It lets me access code as I’m working out a design. Any design or refactoring done without referring to code is prone to be erroneous.
It ensures that I don’t miss a code path. The compiler checks more code paths than I can be bothered to follow in my head, and it reveals problems with those paths.
It allows me to do top-down refactoring. I make the big, disruptive change that satisfies my design objective first, and the compiler guides me through the details of making that change work.
It allows me to evaluate the impact of different design decisions on code immediately, instead of having to guess.
The best thing, though, is that I reduce the mental context I have to carry. This is a significant benefit since it allows me to refactor in smaller sessions. I can commit partially fixed code and carry on from where I left off last time. I don’t need to remember what needed to be done because the compiler reminds me about it.

I followed roughly the same refactoring practices when coding in interpreted languages, but having to execute the program to find errors added a level of complexity, not to mention more print and step-through debugging. It also left more to manual code inspection to make sure all corners were covered, since 100% code coverage in tests is not always feasible.

Using a compiler that does static analysis improves the whole process considerably. Using one with a high-productivity language like Go makes it fun as well!