Saturday, April 25, 2015

Special Case Considered Harmful

Without a doubt, the all-time quickest way to implement any new variability in an existing code base is to create a new branch with if/else statements, copy & paste the original code into its new block, and then modify the copy. You don't have to spend any time thinking. It's quick. It's initially painless. And it is one of the most popular ways to create code.

As a result of this, we often see programs that consist of layer upon layer of nested branches; sometimes four levels deep, sometimes worse. Advanced programmers avoid this by hiding these special cases in a mass of methods or functions.

Although the code itself gets larger, the testing of conditionals is usually cheap so there aren't any real performance penalties associated with this type of code. It may not be optimal, but it's not really slow either.

Testing does require more effort; forgetting to check the functionality of a newly added special case can easily mean that it gets released accidentally. If the branches are deeply nested, it can take considerable effort to unwind all of them into a complete list of test cases. Most often that isn't actually done, leading to bugs.

Extending the code is where the worst hit is taken. A few branches are not hard to deal with, but as they quickly build up, it becomes harder and harder to work out their combined behaviour just by reading the code. With so many test cases needed to cover the whole range of behaviour, it also becomes harder to understand what is happening when using a debugger. Thus you can't read it, and walking through it is no easy task either.

As the difficulties mount, the uncertainty grows -- probably exponentially -- until any new change to the code has a much better chance of being wrong than it does of working properly. At this point, since the code can no longer be safely changed, other, worse options are usually selected.

At the root of this very common problem are special cases. If there is only one way in the code to handle all of the data, then everything goes through the same logic and there is no need for any branches. The code is almost trivial.

Still, most people find it easier to solve big problems by breaking off a subset and then handling it in a 'special' way.

It's taught to us offhandedly, in that we learn to solve problems by decomposing them. That's fine, except when that decomposition cuts across deeply related problems. Breaking up everything into special cases might be an effective way to understand how to go about solving the pieces, but the second half of that approach -- the part that isn't taught -- is to bring it all together again for the whole solution.

Special cases are often highly favored in many domains, so the domain experts and analysts are always a significant source of them entering the project. Big systems get built by going after the low-hanging fruit first, then picking off the lesser cases one by one. This ensures that progress is visible right from the onset of the project.

Thus the proliferation of special cases is fast, easy and we're often led to believe that it is the correct way to handle complex problems. That this is then mapped directly into deeply nested code or a quagmire of functions is not really a surprise. But it doesn't have to be that way.

One of the most visible aspects of reading any elegant code is that it has neither an excessive number of branches nor too many loops. The code is organized, the flow through it is clear and the special cases are few or non-existent. Clarity comes from the level just above the instructions; done well, it keeps the logic from spewing off into a myriad of spaghetti-like flows.

From those rare examples, we can easily conclude that minimizing branches is both good and definitely worthwhile if we are intent on having the code continue to expand. The follow-up questions then are "how?" and "what's the minimum number of branches?"

Long ago, when I first started programming, there was always a lot of effort put into simplifying coding logic. The earlier languages offered little guidance in structure, so the programmers had to spend more time back then trying to work out appropriate ways to keep it from getting out of control. Object Oriented programming, and indirectly design patterns, embedded some of that organization directly into the languages themselves, but at the expense of the newer generations of programmers no longer understanding why well-structured code was important. The emphasis shifted to gluing stuff together as fast as possible. That might work for consolidating clumps of existing technologies, but it's an unmanageable nightmare when applied to a sea of special cases coming in from the domain side.

What's necessary is to avoid implementing special cases wherever and whenever possible. And to clean up some of the existing mess, it is critical to merge any existing special cases back into the mainstream. Fortunately special cases are easy to spot, since they exist as branches, and they are easy to remove: you just get rid of the if/else statements and merge the code together.

This does require some thought and practice, but it is not rocket science and it becomes easier to do as you gain more experience.

In the really easy cases, you can drop a branch when the difference between the two underlying blocks of code is only a variable or two. That is, you create new variables, assign them values at the beginning and then perform the sequence of actions, passing in the variables where appropriate. In the worst case, you might have to go back a ways through the code, adding the variables right up to the point where the data is loaded. I'll get back to that approach later.
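
As a rough sketch in Java (the names and numbers here are made up purely for illustration), the before-and-after looks something like this:

    // A minimal sketch of collapsing a copy-and-paste branch into variables
    // assigned up front, leaving a single code path. Hypothetical names only.
    class FeeCalculator {
        // Before: two nearly identical blocks, differing only in two values.
        static double feeWithBranch(double amount, boolean premium) {
            if (premium) {
                double fee = amount * 0.01;
                System.out.println("premium fee: " + fee);
                return fee;
            } else {
                double fee = amount * 0.02;
                System.out.println("standard fee: " + fee);
                return fee;
            }
        }

        // After: the differences become variables set at the start; the logic runs once.
        static double feeWithoutBranch(double amount, boolean premium) {
            double rate  = premium ? 0.01 : 0.02;
            String label = premium ? "premium" : "standard";
            double fee = amount * rate;
            System.out.println(label + " fee: " + fee);
            return fee;
        }
    }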

For some cases, the difference is actually contained in a larger set of variables. This is handled the same way as individual variables, but with a structure or object instead of a single variable. Keeping related variables together and building up complex structures to pass around is a strong approach to organizing any code.
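
Sketched out, again with hypothetical names, the grouped-variable version is just the previous idea with an object carrying the related values:

    // When several values vary together, group them into one object and pass it
    // through the single code path instead of branching on each one.
    class Pricing {
        final double rate;
        final double minimumFee;
        final String label;

        Pricing(double rate, double minimumFee, String label) {
            this.rate = rate;
            this.minimumFee = minimumFee;
            this.label = label;
        }

        // One code path; the "special cases" are just different Pricing instances,
        // e.g. new Pricing(0.01, 5.0, "premium") or new Pricing(0.02, 2.0, "standard").
        double fee(double amount) {
            double fee = Math.max(amount * rate, minimumFee);
            System.out.println(label + " fee: " + fee);
            return fee;
        }
    }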

For other cases, the lists of executed instructions differ slightly from case to case. There are some subtle ways of handling this, but they usually involve either table-driven programming or an intermediate data structure. The idea is that you build up data that lists the instructions, varying by each case, as you go. Once the structure is complete, you walk through it, executing everything. Simple lists fit into statically created tables, while more complex logic can be dynamically created as lists, trees or even DAGs. You'll find this type of code at the heart of interpreters, compilers, editors, graphics programming and some really beautiful domain applications. The if statements are implicitly hidden because of the way the structure was constructed. This is a powerful way to eliminate special cases and to control complexity, while still making it almost trivial to extend the code. For static tables you just add in new rows; for intermediate representations you just add more functionality to manipulate the structure.
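
A small table-driven sketch in Java, with invented thresholds and rules, shows the shape of it: the variation lives in data rows, and the logic that walks the table never changes.

    // Table-driven code: adding a case means adding a row, not another branch.
    import java.util.List;
    import java.util.function.DoubleUnaryOperator;

    class DiscountTable {
        // Each row: a threshold and the operation to apply once the amount reaches it.
        record Row(double threshold, DoubleUnaryOperator apply) {}

        static final List<Row> RULES = List.of(
            new Row(1000.0, amt -> amt * 0.90),   // large orders: 10% off
            new Row( 100.0, amt -> amt * 0.95),   // medium orders: 5% off
            new Row(   0.0, amt -> amt)           // everything else: unchanged
        );

        static double price(double amount) {
            // Walk the table; the first matching row wins. The per-case branches
            // have disappeared into the data.
            for (Row row : RULES) {
                if (amount >= row.threshold()) {
                    return row.apply().applyAsDouble(amount);
                }
            }
            return amount;
        }
    }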

That of course leads to employing polymorphism in OO languages. The special cases disappear because everything is handled by overrides of basic primitive methods that are specific to each class. That is, the blocks in the if/else statements form the methods implemented in the objects. Then you get a 1:1 relationship between the smallest blocks needed and the primitives necessary to solve the problem. Set up correctly, expanding a polymorphic hierarchy requires as little thinking as creating deeply nested code (which generally means you have to be careful) but the end result is still organized. To go a bit farther, you can even reuse common code blocks via inheritance, which saves on typing and testing.
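
A minimal polymorphic sketch, with made-up class names, shows how each former block becomes an override of a small primitive method:

    // Each case is a subclass; shared logic lives once in the base class.
    abstract class Shape {
        abstract double area();          // one primitive per block of the old branch

        String describe() {
            return getClass().getSimpleName() + " with area " + area();
        }
    }

    class Circle extends Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        double area() { return Math.PI * radius * radius; }
    }

    class Square extends Shape {
        final double side;
        Square(double side) { this.side = side; }
        double area() { return side * side; }
    }
    // Adding a new case means adding a new subclass; no existing branch is touched.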

In particularly messy systems where there are an explosive number of permutations that all might need to be handled someday, polymorphism is a really strong approach. Even before OO, we'd implement the same mechanics in C by using table-driven function pointers that implemented the primitives. Then, to change the processing for different sets of data or configuration, all you needed to do was update the table with a new set of function pointers, making it easy and reliable to extend to new cases. Setting up that type of organization doesn't take much time, but it saves massive amounts later as the code grows.
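
For consistency with the other sketches, here is the same dispatch-table idea expressed in Java rather than C: a map from a configuration key to a small handler interface stands in for the table of function pointers. The formats and handlers are invented for illustration.

    import java.util.Map;

    interface Handler {
        String process(String input);
    }

    class Dispatch {
        // The "table": swap behaviour for new data or configuration by adding an
        // entry, not by editing existing logic.
        static final Map<String, Handler> HANDLERS = Map.of(
            "csv",  input -> "parsed CSV: " + input,
            "json", input -> "parsed JSON: " + input
        );

        static String process(String format, String input) {
            return HANDLERS.getOrDefault(format, in -> "unknown format: " + in)
                           .process(input);
        }
    }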

Sometimes the special cases are very, very hard to remove. For those we have to go up a level of abstraction, and then manufacture virtual elements to smooth out the logic. This is a tricky technique to explain in general terms, but the basic idea is to add in things that don't really exist, but which, if they did, would leave no special cases remaining. A very simple example of it exists in "The Art of Computer Programming", where Donald E. Knuth documents a couple of algorithms for linked lists that always have a head node (circular and doubly linked). That is, the list is never empty, even if there is nothing in it. This simple trick reduces the logic down to something quite manageable and provides several neat little micro-optimizations. It's a very small "space vs. time" trade-off that's unfortunately been mostly forgotten.
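
A small Java sketch of that head-node trick, along the lines of the circular, doubly linked structure described above (the class and method names are my own):

    // The permanent head node means the list is never "empty", so insertion and
    // removal need no special cases for the first or last element.
    class IntList {
        static class Node {
            int value;
            Node prev, next;
            Node(int value) { this.value = value; }
        }

        private final Node head = new Node(0);   // sentinel; holds no real data

        IntList() {
            head.prev = head;
            head.next = head;
        }

        void append(int value) {
            Node node = new Node(value);
            node.prev = head.prev;               // old last node (or head itself)
            node.next = head;
            head.prev.next = node;               // the same two lines work whether
            head.prev = node;                    // the list was empty or not
        }

        static void remove(Node node) {
            node.prev.next = node.next;          // works for any real node, again
            node.next.prev = node.prev;          // without checking for the ends
        }
    }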

I've had to use this abstraction technique many times in the past to simplify difficult, yet performance-dependent logical flows. The different cases are tantalizingly close to each other, but they don't actually fit together. It can take a bit of creativity and patience, but eventually you stumble onto some virtual piece that ties it all into one. The key is that the code actually gets simpler; if it doesn't, then you need to toss it immediately and try again. Almost always it has taken a couple of iterations to get right, but in the long run it has always been worth it.

If you look carefully, you'll see this technique in all sorts of interesting places, but because it rounds out the code so nicely it is easy to miss. Still, if you really want to write something sophisticated, it is an indispensable part of your toolkit.

Sometimes branches are just unavoidable, but that doesn't mean they have to gum up the code. Like limiting the scope of variables, you can push a branch down to the absolute lowest point in the code, so that it wraps only the minimum instructions that differ. Doing that is often enough to keep it encapsulated away from the rest of the complexity. Another strong way to handle stubborn branches is to move them to the earliest part of the code, where the data comes in. It has always been a good idea to convert and verify data only 'once' upon entry from an external source, thus avoiding needless fiddling and redundant checks. This is also a great place to add in variables that avoid branching. Placed there, the branch can set up a variable that flows cleanly through the rest of the main logic.
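
A short Java sketch of hoisting a stubborn branch to the point where the data enters the system (the field names and the default value are purely illustrative):

    // The check runs exactly once, at the boundary, and sets a plain value that
    // flows through the rest of the logic without further branching.
    class Order {
        final String currency;   // normalized exactly once, at data entry
        final double amount;

        Order(String rawCurrency, double amount) {
            // The only branch lives here, where the data comes in.
            this.currency = (rawCurrency == null || rawCurrency.isBlank())
                    ? "USD" : rawCurrency.trim().toUpperCase();
            this.amount = amount;
        }

        String summary() {
            // Downstream code relies on a clean value; no defensive checks needed.
            return amount + " " + currency;
        }
    }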

Most special cases are unnecessary. There are a very few rare ones that are completely unavoidable, but long before accepting any new special case into that category, every effort should be made to fit it into the mainstream properly. Special cases match the way people think, and they get created quite naturally by various domains, programmers and the analysis process. That is to be expected, so it is up to the system's designers and its programmers to merge them back into as few cases as possible. Although it's easy to just whack out more special cases in a code base, doing so is one of the fastest and most dangerous forms of technical debt. Intrinsically, each one is a bump up in disorganization. It's also mindless, and in software, anytime something isn't thought about explicitly, it usually manifests itself as a problem. We got into software development because we like to solve problems by thinking deeply about them; oddly, trying not to think deeply about them is counter to what most people claim they enjoy.

Thursday, April 16, 2015

Shades of Grey

Every profession changes its practitioners. How could it not? They'll spend a big part of their life approaching problems from a given angle, applying knowledge built up by others with that same perspective. Prolonged immersion in such an environment gradually colours their views of the world.

Programmers encounter this frequently. Coding is all about engaging in a dialogue with rigid machines that don't tolerate ambiguity or understand intentions. They do exactly, and precisely, what we tell them to do. There is no greyness, no magic; the machines only follow their instructions as given and nothing else (if we ignore issues like faulty hardware and viruses).

This constant rigidity leads us to frame our instructions as being 'right' or 'wrong'. That is, if the code compiles and runs then it is right, else it is wrong. There is nothing in between.

Gradually this invades how we see things. We quickly go from the idea that the instructions are right, to the idea that we've built the right thing for the users, and then we talk about the right ways to design systems and run projects. But for many, it doesn't just stop there. It continues to propagate outwards, affecting the rest of their views of the world. And the way they interact with it.

A long, long time ago, back when we were first learning object oriented design, one of my friends declared "There can only be 'one' right object oriented design. Everything else is wrong". I giggled a bit when I heard this, but then I went on a rather prolonged ramble about the greyness of trade-offs.

I explained that known issues like space-time trade-offs meant that there were actually a great number of different, yet equally valid designs, all of which should be considered 'right'. That some of those trade-offs might be constrained by known user or operational requirements, but that there were many more that differed only because of 'free' parameters.

My friend was sceptical at first, but he came around when I began talking about different 'sorting' algorithms. Ones with very different computational complexities in the best, average and worst categories. They all sorted, quite efficiently, but the whole collection of behaviours was considerably varied. Optimizing for any one aspect of a complex system always offsets some other aspect. That is, one is traded off for the other.

The simplest example of this is 2D vectors on a Cartesian plot. If magnitude is the only constraint, then there are an infinite number of vectors that are all basically identical. It doesn't matter where they are on the plot, or at what angle. A single constraint in n variables maps lots of stuff onto each other.

Getting back to software: it's discrete, so there is actually a fixed set of 'good' object oriented designs, but it is a really huge set. Outside of that set there is a much larger one of all the less-than-correct designs. But that still doesn't make any one specific design 'right'; it just makes it one of the many 'appropriate' solutions. "Right" is definitely the wrong word to use in this context.

When I started programming, the big decision we had to make was whether we were 'vi' or 'emacs' people. Both editors require significant effort to learn to use correctly. Emacs was seen as the grander solution, but vi was pretty much available everywhere and consumed a lot fewer resources. On the early internet, wars raged between the proponents of both, each pounding on their keyboards about the rightness of their view. That of course is madness; editors are just tools, and the different varieties suit different people, but almost by definition there is no one perfect tool.

There is no 'right' editor. There are certainly some crappy editors out there, but they are invalid only because they are plagued with bugs, or are missing core functionality. They are deficient, rather than 'wrong'. If you can do the basics of editing text, then an editor is functional.

In that same sense, we can talk about lines of code being right, in that they actually compile, and we can talk about compiled code being right in the sense that it actually runs, and perhaps we could include 'proofs of correctness' to show that the code does what we think it does, but beyond that it quickly becomes grey. There is no right program for the users. There is no right architecture. No right way to build systems. As we get farther away from the code, the shades of grey intensify. Right and wrong quickly become meaningless distinctions.

A quick perusal of the web shows that many of the discussions about computers out there are predicated on this mistake. People argue in general terms about how one end of a trade-off is so much better than the other. But it doesn't make sense, and it siphons away energy from more meaningful discussions.

A black-and-white perspective comes with the early stages of learning to program. It is very easy to let those views cloud everything else. I fell into this trap for a long time. Gradually, I have been trying to correct my views. Right and wrong is far too rigid a frame for making sense of the world. It certainly doesn't help us understand how to interact with people or come together and build sophisticated computer programs. It doesn't make life any easier, nor does it make anyone happy. Its sole purpose is to help down at the coding and debugging level, and once you've learned its place, you have to learn to leave it where it belongs.

The safest thing to do is to not use the words 'right' or 'wrong' at all. They tend to lead towards missing the essential nature of the underlying trade-offs. It's far better to come at complex systems and environments with an objective viewpoint. Seen that way, any specific choice might have better qualities in some respects, at the expense of known or unknown side effects.

Another easy example is with computer languages. Some languages make doing a task faster and more understandable, but all of them are essentially equivalent in that they are Turing-complete. For a given problem, it might be more naturally expressed in Lisp, but it also might be way easier to find Java programmers that could continue to work on the project. The time saved in one regard, could be quite costly in another.

More to the point, the underlying paradigms available as primitives in Lisp could be reconstructed in Java, so that the upper-level semantics are similar. That is, you don't have to do lispy things only in Lisp; they are transportable to any other programming language, but you have to take one step back from the language in order to see that. And you can't be objective if you are dead set on the inherent wrongness of something.

Putting that all together, although at the lowest levels we need to view things in a black and white context, at the higher levels labelling everything as right and wrong is debilitating. If you're discussing some complex aspect of software development and you use the word 'right', you're probably doing it wrong.