Thursday, April 7, 2011

Economy of Expression

One striking feature that I’ve always found in elegant code is how it manages to do so much with so little. In its expression it is dramatically stingy, yet in its functionality it is extremely broad and applicable to a large number of similar problems.

Software ultimately is just a very long sequence of instructions for a computer to follow. When building up these sequences, we can code each of them very precisely for one specific instance of a specific problem, or we can take a step back and generalize them to allow for a wider range of reuse.

Nearly all programming languages do this by providing standard libraries that accomplish common tasks. Many of these libraries find simple, elegant abstractions that allow for a huge range of functionality by providing an underlying set of consistent primitives, on which programmers can build larger, more specific blocks of code. Get the primitives right and the next layer gets cleaner and easier to build. Get them wrong, and the eccentricities percolate into the code above, making it more difficult to write.

The programmers behind great examples of code reuse understand this well. Instead of pounding out an endless series of nearly duplicated lumps of code, they have built layer after layer of clean, well-thought-out Lego-like blocks that get assembled into ever higher functionality.
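As a small illustration of that layering, here is a minimal sketch in Python. The language choice and the function names (words, word_counts, most_common_word) are assumptions made up for this example. Each function is a thin layer that relies only on the one directly beneath it:

```python
from collections import Counter

def words(text: str) -> list[str]:
    """Lowest layer: split text into lowercase words."""
    return text.lower().split()

def word_counts(text: str) -> Counter:
    """Middle layer: built only on words()."""
    return Counter(words(text))

def most_common_word(text: str) -> str:
    """Top layer: built only on word_counts()."""
    return word_counts(text).most_common(1)[0][0]
```

Calling most_common_word("the cat and the hat") goes through all three layers, yet each layer can still be read and reused on its own.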

The key to doing this isn’t allowing for a mass of configurable options, or forcing a large number of separate declarative bindings. It isn’t allowing lots of arguments to the methods, or wiring everything up with arbitrarily convoluted underlying rules. Rather, when done well, it usually comes from a simple, clean abstraction, iron-like consistency and a serious economy of expression, both in the internals and in how it is used externally.

That is, a simple and consistent abstraction yields a small number of primitives that are easily interwoven at a specific level. Handled correctly, it encapsulates all of the nasty details while providing an obvious set of default behaviors. Simple things are simple to do. At its best, the calling code is straightforward to read and naturally self-documenting. You don’t have to know the underlying details or read reams of incoherent documentation to get a correct sense of the underlying behavior. You can use it, it is obvious, and you can move on.
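To make "simple things are simple to do" a little more concrete, here is a hedged sketch in Python. The slugify function and its separator parameter are hypothetical, invented for this example. The messy details (case, punctuation, spacing) stay hidden, and the default covers the common case:

```python
import re

def slugify(title: str, separator: str = "-") -> str:
    """Turn a title into a lowercase, separator-joined slug.

    Hypothetical helper: punctuation and spacing rules are hidden
    from the caller, and the default separator covers the usual case.
    """
    parts = re.findall(r"[a-z0-9]+", title.lower())
    return separator.join(parts)
```

The common call is just slugify("Economy of Expression"), which yields "economy-of-expression". The one optional argument is there for the rare case that needs it, for example slugify("Hello, World!", separator="_").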

This often happens with well-written libraries, but the principle can be applied to all software code. A large application may have many layers, but each layer stands on its own as a simple, readable and easily understandable work.

One way to judge the degree of success in achieving this is to give the work to another programmer who has never seen it before. If they can get a fairly good grasp of what the code is doing nearly instantly, then it has these qualities. If they are confused, or turned around, or need extra documentation, then the code does not speak for itself very clearly, or the abstraction is too convoluted (or the programmer is too junior for this type of abstraction).

To get really strong economy of expression, the primitives must be few in number, must not overlap, and must cover all of the possible operations. A non-programming example is +, -, * and /. With just four operators, one can do a large amount of arithmetic. AND, OR and NOT are another example of a consistent family of primitives (as are NAND and NOR). Applying these primitives to objects (mathematical in this case) in various combinations provides for a wide range of applications. There are no oddly overlapping operators like addOneAndMultiplyByTwo, since at the level where they are used, such an operator would be unnecessary and confusing.
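The Boolean case can be sketched in a few lines of code. This is only an illustration in Python, with hypothetical function names: starting from a single primitive, NAND, the operators NOT, AND and OR fall out by combination alone:

```python
def nand(a: bool, b: bool) -> bool:
    """The single primitive: false only when both inputs are true."""
    return not (a and b)

# Everything below is built only from nand (names are hypothetical).
def not_(a: bool) -> bool:
    return nand(a, a)

def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))

def or_(a: bool, b: bool) -> bool:
    return nand(not_(a), not_(b))
```

The same holds for arithmetic: adding one and then doubling is simply (x + 1) * 2, a combination of two existing operators rather than a new one.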

Simple, well-thought-out primitives lead to very clean implementations underneath. They allow any layer built on top to be clean and readable. They simplify, encapsulate and self-document some underlying complexity, while providing a strong base for building on.

When this is done at each level of a large system, it becomes really easy to view each level individually, with its inherent complexity, and understand how it operates with respect to the level below. It makes the system easy to change, without having to worry about unintended side-effects. It makes the system easy to extend, to cover new functionality. This, by definition, is the essence of elegance. Clean and simple with no weird bits or thingys that will get posted to WTF. Done well, further development work on an elegant system becomes interesting, fun and fast as opposed to being time-consuming, painful and dangerous.