Thursday, May 15, 2008

Hard Code'n

I think that at this stage in our industry, it is important to differentiate between several key, yet very different parts of the software development process. Specifically, I see a huge difference between "software development", which includes design, development and deployment of software, and "programming", which is focused on completing a set of instructions in a computer language to implement some functionality. One is the all-encompassing act of creating software, including every aspect from beginning to end, while the other is a very specific subset of the process that focuses on writing code to implement some set of algorithms.

In many ways I see this division as similar to accounting vs. bookkeeping. Bookkeeping is an important part of accounting, but it doesn't necessarily have to be handled by a fully-trained accountant; in fact, most bookkeepers I know are not accredited accountants. Accounting includes far more than bookkeeping, but bookkeeping is an essential part of it. There is even a "higher" side of management accounting, which still deals with the science, yet only at a very high management level.


If I am going to split one out from the other I need to carefully define them or risk the wrath of the net (or even worse, silence). I see programmers as taking descriptions of functionality and making them into code. Software developers, on the other hand, analyse specific user domain problems and then design and implement solutions to aid the users in building up their ever-increasing piles of data. In that way, programming is just a tiny part of the overall software development. It happens somewhere in the middle. It is the process of "encoding" some functionality into a language as a long set of instructions and doing some testing/fixing to make sure it works. Everything else is software development.

I'm well aware of how our industry and programming culture love to mix together analysis, design, requirements, coding and testing all into one giant lump; for most people these are one and the same operation. Just another day at the office.

I think that mixing these together is a huge mistake because the skill sets are very different from each other. Not to trivialize it, but programming -- such as implementing a function to calculate the Fibonacci sequence -- is reasonably well-understood and well-established. Depending on the functionality, there exists an algorithm or not. At worst, implementing the functionality may require the use of several different algorithms all combined together or modified slightly. Generally, for most types of code, examples already exist and can be modified to fit. The problem of function -> code can have its challenging moments, but ultimately it is not a hard problem if you know what you are building. The key is in knowing.
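To make the Fibonacci example concrete, here is a minimal sketch in Python. The function name and the convention that the sequence starts 0, 1 are my own assumptions for illustration; once you know what you are building, the code is only a handful of lines.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, assuming fibonacci(0) == 0 and fibonacci(1) == 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    # Walk forward n steps; after each step, a holds the next number in the sequence.
    for _ in range(n):
        a, b = b, a + b
    return a
```

The "hard" part here is not the loop; it is deciding up front what the function should return for 0, for negative input, and so on.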

That is why I really like this differentiation. You see, when software developers discuss whether or not it is an art or a craft, or intrinsically hard, what they tend to do is blur the line between analysis and programming. Analysis is hard because we don't know what the users need or what will actually work, but once having settled on a specific algorithm, "coding" it is not all that challenging. Sometimes it involves a bit of research (or should), but after that it's just work.


Well, sometimes. If you sit on enough development teams you quickly come to realize that many programmers have serious weaknesses. We, as techies, love the intricacies of tiny machinery like watches. All those little dials and gears and little things appeal to most programmers at a low level. Not surprisingly, our single greatest problem while programming is the tendency to "over-complicate" our solutions. We drift towards pedantic, complex solutions that come from over-thinking the problem. We like the fiddly bits, so we add them wherever possible. We are also "option" happy, adding in tonnes of them that never get used.

You'll see it in most code: tonnes of unnecessary variables, conditions and loops. Redundant copies, extra layers of handling, and buckets of "glue" code. Fiddly little bits on diagrams, excessive casting, big ugly useless comments, two-inch-thick designs or manuals, etc. Even programmers who love to call themselves lazy will frequently implement 5,000 lines of code when a mere 200 might do.

It is an epidemic problem with programmers, and I've never met one that wasn't guilty in some form or another. If you think you don't do it, then you've probably not been coding long or hard enough; the simplest, most elegant answer is far simpler and more elegant than most programmers have even begun to realize.

Not only do we constantly over-shoot the code, we also build intricate and complex solutions that drive our users nuts. They're often looking for a quick simple solution, and instead we've built some monolithic, all-encompassing, power-hungry solution where even the simplest bit requires the memorization of masses of new terminology and a three-week course on how to apply it. Manuals, they like to say, are only there to document the design flaws. A reasonable viewpoint, I think.


Getting back to programming. If indeed you understand the steps necessary to implement your specific functionality, then it is not a particularly hard endeavour. In the end, for most languages, it's some number of variables, a clump of conditions and a few loops, the fewer the better. Programmers "love" to dive into writing some complex code, but most often it's either really simple and straight-forward, or there is a well-known algorithm to handle it. Most code is just tying things together and converting between one physical structure of the data and another. These days, the really complex stuff is buried in libraries, far away from most programmers' hands or eyes.

Even more simply, you can see any type of functionality as a transformation on some data. That makes it almost trivial: the data exists in the system or it needs to be loaded, then some algorithm is applied to transform it into some other structure. Then it is saved and/or written out. Programming, from that perspective is not particularly complex; unless we choose to make it so.
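As a rough sketch of that load, transform, save view, here is a small Python example. The record layout (a "dept" and an "hours" field) and the function names are hypothetical, invented purely for illustration; the data is held in strings so the example stays self-contained.

```python
import json


def load(raw_text: str) -> list:
    """Load: parse the incoming data into in-memory records."""
    return json.loads(raw_text)


def transform(records: list) -> dict:
    """Transform: restructure the records into total hours per department."""
    totals = {}
    for record in records:
        dept = record["dept"]
        totals[dept] = totals.get(dept, 0) + record["hours"]
    return totals


def save(totals: dict) -> str:
    """Save: write the new structure back out (here, as a JSON string)."""
    return json.dumps(totals, sort_keys=True)


# Hypothetical input data, made up for this example.
raw = '[{"dept": "hr", "hours": 5}, {"dept": "it", "hours": 3}, {"dept": "hr", "hours": 2}]'
result = save(transform(load(raw)))
print(result)  # {"hr": 7, "it": 3}
```

Each step is a few variables, a condition or two and a loop. Nothing in it is hard unless we choose to make it so.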

When it is complicated, we tend to find really simple reasons why that is true. The most common is that the programmer is making it too complicated; either they've misunderstood the problem or they've misunderstood how the tools work. I've seen enough programmers "flailing" at their keyboards over the years. There is some abstract aspect to programming that some people just never grasp, while others have to work hard to get better at it. Mostly, I think it's some type of anxiety, where people "think" that the problem is hard, so they skip right past the simple solution and start making it really complicated. A kind of programming fear-of-failure delusion. "It just can't be 'that' simple" we like to tell ourselves.

There are many people afflicted with this type of problem, but fear not if you are one, for most of them coding gets simpler and easier with practice. The real trick is to keep going back and "simplifying" the code, not "adding" to it. E.g. if it doesn't work, don't try to "add in" more logic, instead start stripping it away until it is smaller and simpler. Removing code is the best tool for debugging. It may seem like a slower approach, but it is way way faster than flailing at it. I had a boss once that taught me by leaning over my shoulder and hitting the delete key over and over again. He'd nuke it and make me type it in again. It was the best programming lesson I ever learned (by the third time, you really get it).


Beyond intricate, some programmers gravitate to "clever". They get pulled into really clever ideas that seem like they are going to work really well. Well, at first they seem great. The problem with clever is that it is an extremely "low" level of working. Clever is not simple, in fact it is nearly the opposite. It's a little bit of concentrated complexity all nicely bundled up into a neat programming package. That might work for writing, but it's the type of thing that you come back to "months" later and instantly regret.

Clever you see is just a waste of time at some point in the future. The problem is that to get to something clever, you probably had some cool inspiration. A light went off in your head, or a neat idea popped up in your mind. That's great, but it's not the normal way of thinking. Generally that causes a type of compressed complexity, a neatly packaged clever idea. That makes it a land-mine waiting to get stepped on.

Someone can easily mistake the point or functioning of the code, and in all likelihood, unless you're lucky enough to get fired, someday, at some point, when you least expect it, you'll have to go back in a rush and try to fix some stupid problem. That, by the way, is always the case with clever. You are essentially just setting yourself up, aren't you?

Given that, however, "abstraction" is not clever. It is a generalization of the purpose of the code, not some cute little syntax trick or something else tricky. When I say clever is bad, sometimes people take that to mean that "brute force" is good, but that's hardly what I mean either. Pounding out each and every instruction is a huge waste of time, and it's hard to maintain. Brute force is too specific and too large. Clever is too compressed, it took longer to write it, and it's a land-mine.
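One way to picture the three styles is a toy problem: the number of days in a month. This is only an illustrative sketch; the function names are made up, and it deliberately ignores leap years to stay short.

```python
def days_brute_force(month: int) -> int:
    """Brute force: every case pounded out by hand. Large and too specific."""
    if month == 1:
        return 31
    if month == 2:
        return 28
    if month == 3:
        return 31
    if month == 4:
        return 30
    if month == 5:
        return 31
    if month == 6:
        return 30
    if month == 7:
        return 31
    if month == 8:
        return 31
    if month == 9:
        return 30
    if month == 10:
        return 31
    if month == 11:
        return 30
    if month == 12:
        return 31
    raise ValueError("month must be 1..12")


def days_clever(month: int) -> int:
    """Clever: a compressed arithmetic trick. Correct for 1..12, but try fixing it in a rush."""
    return 28 + (month + month // 8) % 2 + 2 % month + 2 * (1 // month)


# Simple: the data is stated plainly, and the code just looks it up.
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def days_simple(month: int) -> int:
    """Simple: short, obvious, and easy to check against a calendar."""
    if not 1 <= month <= 12:
        raise ValueError("month must be 1..12")
    return DAYS_IN_MONTH[month - 1]
```

All three give the same answers for months 1 through 12. Only the last one can be read, verified and changed at a glance, and it is also the one that looks like the least work.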

Good simple short code -- the definition of elegance -- that works at a reasonable level of abstraction so that it can be leveraged, is what does the best for the long term goals of a software development project. A great programmer is someone who can take a hard problem and make the resulting code look simple. It should be so obvious that it doesn't look like a lot of work.


Another really common problem draws its strength from our unfortunate desire to see programming as an 'art form'. You meet enough programmers who don't want to be engineers, so they don't want any process of any kind. Worse still, they want the creative "right" to pick a new and unique way of solving each problem, each time. Even if it's the same problem over and over again.

And so, by their inconsistency and lack of structure, they create around themselves an ever-increasing vortex of complexity. Mostly you see this with the cowboys, and their fast, yet dangerous band-aid approaches. Cut and pasters are another entertaining variety.

It's quick, it's fluid, it works for a while, but like any continual short-term strategy it builds up to the point where it becomes an uncontrollable nightmare.

Fundamentally software development is engineering. We are building something, and we do need to balance out the long-term work with the short-term pressures. Software is saved by the fact that its total ugliness is not visible (if it were there would be a lot of "fired" programmers), but that doesn't mean the effects won't be visible. When you are building "anything" you can only cheat for so long before it becomes unworkable. Sure, if it is a short "assembly" job of combining some pieces together to whack out a simple application for a couple of months, you can get away with a huge number of short-cuts, but once it becomes a multi-year, multi-developer project, each and every short-cut (even the ones that you don't think are actually short-cuts) builds up.

If and when they build up enough, they account for a significant number of project failures. Sadly, "sloppy process" failures are entirely preventable, but only by people who understand them.


If it is not the programmer, or the chaos then it is the functionality itself. It's either poorly specified, or perhaps even just a really "bad" idea. The real trouble in programming doesn't come from feeding in lists of instructions into an abstract machine for execution. Nope. It comes from tying that back to the "real world".

People are irrational, messy and the source of huge problems. If the functionality is not well-defined or it is not "workable", the core reasons behind it almost always come down to people. Whether it be limited thinking, politics or egos doesn't really matter; it's all the same.

All software ultimately is for people to use, so it is actually easy to get the functionality back onto the right track: "pick something simple". Then specify it, in some format that makes it easy to see if it's complete or not. From there, it's just back to programming.

Once in a while, in order to get the system running, the core contains something extremely complex. Generally this is some type of engine or parser or processor or something extremely hard. The really heavy duty programming can be tough, particularly if it is breaking new ground, but it rarely accounts for even a significant percentage of the overall system. Writing good heavy weight code generally involves a strong understanding of some complex discipline or the actual problem domain. Ultimately though, even the most complex "engine" breaks down into a large number of simple functions. The trick is not writing the pieces, it is getting them to all work together in some intricate, yet simple and elegant solution, a problem which is clearly "architectural" in nature and not really programming.

What hooks a lot of people is that they tackle complex functionality without considering architecture, so the result is a lot of hit and miss attempts to get it all working together properly. If you build the mechanics into the architecture at the general level, then the lower-levels are just specific algorithms to transform data from one stage in the process into another. The code doesn't really fail; it's the architecture that convolutes the process and makes it messy.

No architecture? No wonder you're having problems. You wouldn't build a house without first designing the internal frames, so why wouldn't you do the same for your code?


Programming, then, by itself is relatively simple. That's hardly surprising as you find that in a lot of specific problem domains, many of the programmers are actually domain experts, not Computer Scientists. You don't need a Computer Science degree to write code. In a very real sense, that is why it is closely aligned with bookkeeping, even though I realize that a lot of people might take offense at that comparison. But, like it or not, great reams of domain-specific code are easily written by other disciplines. And, even more horrifying to admit, for your basic bread-and-butter medium-weight programming work, a degree in computer science is over-kill. You don't have to know about Turing machines to create a screen in a GUI to accept human resources data. You don't need to understand the halting problem to write a social web-app. The expressibility of SQL; does it really matter?

These things have their place in software development, but not necessarily in most programming. They usually only come into play in the core of the technical aspect of a solution, something that is generally wrapped in a framework or infrastructure.

Programming still has its moments when time is tight and you are having trouble focusing, but for most people, after about five years of steady coding it mostly becomes instinctual. I know, there are still readers out there that have been at it a longer time and are still struggling, but if they are fair about why they are struggling, the reasons come down to not really knowing what they are building, as they are building it. It's personal, architectural or analysis, not programming. Really it's a bigger problem.

Software development, on the other hand is extremely young, completely unfinished, and extremely complex. It's the type of thing that people just don't get, and is really hard, even at its simplest level.

You learn this, intrinsically, when you end up in meetings with users who are insisting that the software work in a specific way, while you are quite aware that it is impossible. Not just difficult, but completely and utterly impossible. Yet, it becomes very difficult to explain why it won't work. The certainty is there from experience, yet the ability to simplify it and pass that knowledge on to someone else is lacking.

In that overlap between people and mathematics, the grey area in there is a largely unexplored, unknown world of fantastically complex problems that we haven't even begun to enunciate yet, let alone tackle. We are missing at least one, if not many, different sciences that make up the knowledge needed to build "reliable" complex systems. We're pretty much guessing at it right now, when we should be far more knowledgeable about what works and what doesn't.

Still, while there are many great problems left to solve in Computer Science, and there is still a whole 'process' left to create to solve the on-going "software crisis", the act of programming is not among the key problems. Our biggest issue with programming is our constantly confusing the issues, and trying to fit a one-size-fits-all approach to unify programming and software development. Getting back to my initial point, if you see them as different, then it becomes easier to see and deal with understanding their own unique issues. A bit of structure can be a grand thing.