Friday, February 21, 2014

Levels of Code

Throwing together computer code isn't that tricky, but doing it well enough that it's usable in a serious environment requires a lot of work. Often programmers stop long before their code gets to an 'industrial strength' level (level 5 in this classification).

For me the different levels of code are:

        1. Doesn't compile
        2. Compiles but doesn't run correctly
        3. Runs for the main logic
        4. Runs and handles all possible errors
        5. Runs correctly and is readable

And the bonus case:

        6. Runs correctly, is readable and is optimized

The level for any piece of code or system under consideration is its lowest one; it's the weakest link. Thus if the code is beautifully readable but doesn't compile, then it is level 1 code, not level 5.

Level 1 can be random noise or only slightly broken, but it doesn't matter. It wouldn't even be worth mentioning except that people sometimes check this level of code into their repositories, so it deserves its own level. Level 2 isn't much of an accomplishment either, but a lot of level 2 code actually makes its way into software unnoticed.

Level 3 is where most programmers stop: they get the main logic functioning, but then fail to deal with all of the problems that the code can encounter as it runs. This includes not checking returns from lower-level calls and not being able to cope with the unavailability of shared resources like databases.
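
As a minimal sketch of the difference (the names and the db object here are hypothetical, not taken from any particular system), level 3 code tends to assume the lower-level call succeeded, while level 4 code checks the result before carrying on:

    # Hypothetical lower-level call; assume it returns None when the lookup fails.
    def find_account(db, account_id):
        return db.query("SELECT * FROM accounts WHERE id = ?", account_id)

    # Level 3: the main logic works, but a missing account crashes later
    # with an unrelated error instead of being handled here.
    def charge_level3(db, account_id, amount):
        account = find_account(db, account_id)
        account["balance"] -= amount          # blows up if account is None

    # Level 4: the return value is checked and the failure is reported clearly.
    def charge_level4(db, account_id, amount):
        account = find_account(db, account_id)
        if account is None:
            raise ValueError(f"unknown account: {account_id}")
        account["balance"] -= amount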

Level 4 involves being able to correctly deal with any type of external failure, usually in a way that doesn't involve a crash and/or manual intervention. Networks, databases, filesystems, etc. can all become unexpectedly unavailable. The code should wait and/or signal the problem. Once the resource is available again, the code should automatically return to using it properly. Any incomplete work should be correctly finished. If the downtime is short, it shouldn't even be noticed, except for a log entry.
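
One common way to get that behaviour is to wrap calls to the shared resource in a retry loop with logging, so a short outage leaves nothing behind but a few log entries. A rough sketch, assuming the caller passes in some operation that raises ConnectionError while the resource is down:

    import logging
    import time

    log = logging.getLogger(__name__)

    def call_with_retry(operation, attempts=5, delay=1.0):
        """Retry an operation against a flaky shared resource.

        Waits between attempts and logs the outage; once the resource is
        back, the work completes as if nothing happened.
        """
        for attempt in range(1, attempts + 1):
            try:
                return operation()
            except ConnectionError as err:
                log.warning("resource unavailable (attempt %d/%d): %s",
                            attempt, attempts, err)
                time.sleep(delay)
                delay *= 2  # back off so a struggling resource isn't hammered
        raise RuntimeError("resource still unavailable after retries")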

Level 5 is somewhat subjective, but not as much as most people might assume. Realistically it means that some other qualified programmer can come along and quite easily figure out how to enhance the code. That doesn't mean the code will remain at level 5 after the changes, but it does mean that the work to change it won't vex the next coder. If they understand the domain and the algorithm, then they will understand where to insert the changes. Indirectly this also implies that there are no extra variables, expressions or other distractions. It is surprising how little level 5 code is actually out there.

Level 6 requires a deep understanding of how things work underneath. It means that the code meets all of the other levels while doing the absolute minimum amount of work, and it also takes advantage of techniques like memoization when it's both possible and practical. Extremely rare, it is often the minimum necessary for code to be described as 'elegant'. Failed attempts to reach level 6 can result in the code dropping several levels, sometimes all the way back to level 2, hence the expression about premature optimization being the root of all evil.
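
Memoization is one of the few level 6 techniques that is easy to show in isolation. In Python, for instance, a pure function can cache its results so repeated calls do the absolute minimum work (assuming the arguments are hashable and the function has no side effects):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n):
        """Naive recursion is exponential; caching each result makes it
        linear, because every value of n is computed exactly once."""
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

Done carelessly, the same trick illustrates the failure mode too: an unbounded or stale cache can quietly break correctness and drop the code back down the levels.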

Level 4 code is often good enough, but going to level 5 makes it possible for the code to be extended. Level 6 is a very worthwhile and achievable goal, often seen within the kernel of successful products, but it's incredibly difficult to maintain such a high standard across a large code base.

In this classification I haven't explicitly dealt with reuse or architecture, but both of these are absolutely necessary to get to level 6 and certainly help to get to level 5. Readability is clearly impaired if similar chunks of code are copy-pasted all over the place. A good architecture lays out the organization that allows the underlying sections of the code to efficiently move the data around, which supports level 6. In general, disorganized code usually converges towards level 2, particularly if it is under active development.

Sunday, February 16, 2014

Principles

This post http://tekkie.wordpress.com/2014/02/06/identifying-what-im-doing/ by Mark Miller really got me thinking about principles. I love the video by Bret Victor that he embedded, at http://vimeo.com/36579366, and while the coding examples in it were great, the broader theme of finding a set of principles really resonated with me. I've always been driven to keep building larger, more sophisticated systems, but I wasn't really trying to distill my many objectives into concrete terms. Each new system just needed to be better than the last one (which becomes increasingly hard very quickly).

Framing my objectives as a set of principles, however, sets an overall theme for my past products and makes it easier to be honest about their true successes and failures.

As for principles, I no doubt have many, but two in particular drive me the hardest: one for the front end and another for what lies behind the curtains. I'll start with the latter, since to me it really lays the foundation for the development as a whole.

Software is slow to write; it is expensive and it is incredibly time consuming. You can obviously take a lot of short-cuts to get around this, but the usefulness of software degrades rapidly when you do, often to the point of negating the benefits of the work itself. As such, if you are going to spend any time building software, you ought to do it well enough that it eventually pays for itself. In most instances this payoff doesn't come by just deploying some code to solve a single problem. There are too many development, operational and support costs to make this an effective strategy. It's for exactly this reason that we have common code like operating systems, libraries, frameworks, etc. But these pieces only address the technical aspects of the development; what about the domain elements? They are often far more complex and more expensive. What about the configuration and integration?

My backend principle then is really simple: any and all work done should be leveraged as much as possible. If you do the work for one instance of a problem, then you should be able to leverage that effort for a whole lot of similar problems. As many as possible. For code this means 'abstraction', 'generalization' and eventually 'reuse'. At an organizational level this means some architectural structure that constrains the disorganization. At the documentation level this means that you minimize the time spent writing and maximize the readership.
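
To make the code side of that concrete (the function names below are invented purely for illustration), a one-off solution can usually be reworked, with a little extra effort, into a single piece of code that gets leveraged everywhere a similar problem shows up:

    # One-off: solves exactly one problem, so the effort can never be reused.
    def format_customer_report(customers):
        return "\n".join(f"{c['name']}: {c['balance']}" for c in customers)

    # Generalized: the same effort, leveraged for any list of records and fields.
    def format_report(records, fields, separator=": "):
        return "\n".join(
            separator.join(str(record[f]) for f in fields)
            for record in records
        )

    # The original case is now just one use of the shared code.
    def format_customer_report_v2(customers):
        return format_report(customers, ["name", "balance"])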

Everything, at every level, should be designed and constructed to get the utmost leverage out of the initial effort. Every problem solved needs to be viewed in a much larger context, so that people can spot similar problems elsewhere.

Naysayers will invoke the specter of over-engineering as their excuse to narrow down the context to the absolute smallest possible, but keep in mind that it is only over-engineering if you never actually apply the leverage. If you manage to reuse the effort, the payoff is immediate, and if you reuse it multiple times the payoff is huge. This does mean that someone must grok the big picture and see the future direction, but there are people out there with this skill. It's always hard for people who can't see big pictures to know whether someone else really does or not, but that 'directional' problem is more about putting the wrong people in charge than it is about the validity of this principle. If the 'visionary' lacks vision, then nothing will save the effort; it is simply doomed.

When a project has followed this principle it is often slower out of the gate than a pure hackfest. The idea is to keep building up sets of larger and larger Lego blocks. Each iteration creates bigger pieces out of the smaller ones, which allows for tackling larger and larger problems. Time is no longer the enemy, as there are more tools available to tackle a shrinking set of issues. At some point the payoff kicks in and the project actually gets faster, not slower. Leverage, when applied correctly, can create tools well beyond what brute force can imagine. Applied at all levels, it frees up the resources to push the boundaries rather than being stuck in a tar pit of self-constructed complexity.

My principle for the front end is equally effective. Crafting software interactions for people, whether through a command line, a GUI or a NUI, is always slow and messy work. It is easily the most time-consuming and bug-prone part of any system. It is expensive to test, and any mistakes can cost significant resources in debugging, support, training and documentation. A GUI gone bad can blow a massive hole in a development project.

But an interface is just a way of hanging lots of entry points to functionality for the users to access. There is a relative context to save the users from having to respecify things, and there is often some navigational component to help them get quickly from one piece of functionality to another, but that's it. The rest is literally just window dressing to make it all look pretty.

So if you are going to build a GUI, why would you decompose everything into a billion little pieces and then start designing the screens from a bottom-up perspective? That would only ensure extra effort in making endless screens with nearly the same bits displayed in a redundant manner. You can't design from the bottom up; rather it must be from the top down. You need to look at what the users are really doing and how it varies, and then minimize that into the smallest, tightest set of entry points that they need. An interface built this way is small. It is compact. It contains fewer screens, less work and less code. It takes the users quickly to what they need and then gets them back to a common point again with as little effort as possible. It's less work and they like it better.

A system with hundreds of scattered screens and menus is almost by definition a bad system, since its sheer size keeps it from being cohesive, and thus usable. Functionality is useless if you can't find it. Sure, it is easier to write, since you don't have to agonize over the design, but that lack of thought comes with a heavy price tag.

Programmers build GUIs from the bottom up because they've been told to build the rest of the code from the bottom up. But for an interface this is backwards. To be effective, the interface has to be optimized for the user, and of course this will make the programmer's job far more difficult, but so what? Good coding is never easy, and forcing it to be easy simply dumps the problems back onto the people we are trying to help. The system should be easy to use even if that means the code is harder to write. And if the work is hard but relatively redundant, then that is precisely what the first principle is for. The difficult bits should be collected together and encapsulated so that they can be leveraged across the entire system. If, for example, the coders spent extra time generalizing a consistent paging mechanism for one screen, then that same code should be applied to all screens that need paging. Ten quick but flaky paging implementations are ultimately more expensive and very annoying for the users.
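
To make the paging example concrete (the class and its methods below are invented for illustration, not lifted from any real system), the generalized mechanism might be a single pager that every screen shares, instead of ten slightly different copies:

    class Pager:
        """One shared paging mechanism, reused by every screen that lists data."""

        def __init__(self, items, page_size=20):
            self.items = items
            self.page_size = page_size

        def page(self, number):
            # Pages are numbered from zero; slicing keeps the last page short.
            start = number * self.page_size
            return self.items[start:start + self.page_size]

        def page_count(self):
            return (len(self.items) + self.page_size - 1) // self.page_size

    # Any screen leverages the same, already-debugged behaviour.
    customer_pager = Pager(["alice", "bob", "carol"], page_size=2)
    print(customer_pager.page(0))       # ['alice', 'bob']
    print(customer_pager.page_count())  # 2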

It's hard to put a simple name to this second principle, but it could be characterized by stating that any people/machine interfaces need to be designed in a top-down manner to ensure that they are optimized for the convenience of people rather than for the convenience of the construction. If people are going to benefit from software, then they have to be the highest priority for its design. If money or time is a problem, less stuff should be delivered, but the priority must be people first.

Both principles echo strongly in my past works. Neither is really popular within the software development communities right now, although both frequently get lip service. People say they'll do both, but it is rare in actuality. Early agile, for instance, strongly focused on the end users, but that gradually devolved into the stakeholders (management) and generally got pushed aside for the general gamification of the development process. These days it is considered far better to sprint through a million little puzzles, tossing out the results erratically, than it is to ensure that the work as a whole is consistent and cohesive. Understanding the larger context is chucked outside the development process onto people who are probably unaware of what that really means or why it is vital. This is all part of a larger trend where people have lost touch with what's important. We build software to help people; making it cheap or fun or any other tasty goal is just not important if the end product sucks.