Friday, May 2, 2008

Software Blueprints

Over at www.hans-eric.com -- Hans-Eric Grönlund's most excellent blog -- an interesting discussion occurred in the comments for the blog entry "is agile only for the elites?":

http://www.hans-eric.com/2008/03/28/is-agile-only-for-elites/

I got into a conversation with Jack Repenning; we were talking about the ideas of Alan Cooper. The focus arrived at the point where Jack questioned whether or not blueprints for software can even exist. It is a 'big' topic, one that does not easily fit into a comment box, but one that I have given a great deal of consideration to lately. I thought, given the timing, that I could elaborate on my thoughts in a full-blown blog entry.

The crux of the matter is whether or not there exists some way to create a blueprint from which a complex piece of software can be developed. To an outsider, that may seem like an easy question: "of course" would be their likely answer, but if that were the case one would expect that 'blueprints' in some fashion or other would already exist in the standard software development environment.

We do have lots of documentation, and big projects often have detailed designs, but most commonly the ground-floor programmers ignore the designs in their quest to actually get the code up and running. The common belief is that this happens because there are too many details in software, constantly changing as the code is being built. This swirling vortex of chaos invalidates the design long before the programmers even get close to starting, rendering the bulk of the design effort moot.

Instead, the more modern idea in software development has been to work with small, tight iterations: some upfront design, but essentially you set the project free and let it go where it wants. The small cycles mitigate the risks, and they provide quick feedback to the team, so that they can change with the prevailing winds. Thus the chaos that destroys the usefulness of the grand design / master plan is harnessed as the driving force behind the short iterations. A nice idea, but limited in some ways.


DEEP, DARK AND DAMP QUESTIONS

There obviously are a lot of complex ideas and questions floating around in our development processes. To get at the root, we need to find a good solid place to start, so the first and most obvious question is: can we actually create a blueprint for software? By this I mean two things: a) a representation of the system that is 'small' enough that its flaws can be detected by observation, and b) that the validity of the blueprint stays intact throughout the entire development; the chaos does not significantly damage it. Without these two things, the blueprint is just a management exercise in wasting time. If it isn't small enough to provide value, or if it is out-of-date even before it is finished, then it is not worth considering.

So our ideal blueprint is just a summary of the non-volatile details at a level that can be used to deterministically build the lower levels. Not all things need to be specified, but the things left unspecified will not derail the project. I.e., if the programmer picks 'i' for the name of their iterator variable instead of 'index', the net effect will be the same. If the programmer picks a bubble sort algorithm instead of a quick sort, the net effect will also be the same. If the programmer chooses a drop-down list instead of a table, again the net effect will be the same. If a programmer changes the way a major specified formula is calculated, there will be trouble. If they change the way the components are structured, again the results will be bad. The details in the blueprint are the necessary ones.
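
To make that distinction concrete, here is a minimal sketch (Python, with invented names and data) of a detail a blueprint can safely leave open. The blueprint-level fact is the contract -- return the accounts ordered by balance, largest first -- while the choice of sort algorithm is left entirely to the programmer, with no observable difference:

    # A rough sketch only; the account data and function names are invented.
    def sort_accounts_bubble(accounts):
        # Bubble sort: slow, but it satisfies the same contract.
        items = list(accounts)
        for end in range(len(items) - 1, 0, -1):
            for i in range(end):
                if items[i][1] < items[i + 1][1]:  # compare balances
                    items[i], items[i + 1] = items[i + 1], items[i]
        return items

    def sort_accounts_builtin(accounts):
        # Library sort: a different micro-decision, same observable result.
        return sorted(accounts, key=lambda acct: acct[1], reverse=True)

    accounts = [("alice", 120.00), ("bob", 875.50), ("carol", 430.25)]
    assert sort_accounts_bubble(accounts) == sort_accounts_builtin(accounts)

Either implementation honours the blueprint; only a change to the contract itself, the equivalent of altering a specified formula, would be a blueprint-level event.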

We can skip past the general question of whether or not a blueprint can actually exist. Jack Reeves put forward an intriguing idea twenty years ago when he suggested that the 'code' was the blueprint:

http://www.developerdotstar.com/mag/articles/reeves_design_main.html

It is a fascinating idea: the programmer is the designer and the computer is actually the worker. In that sense, even with modern interpreted languages, the code in source control could be considered the blueprint while the packaged, distributed and deployed systems are the final products. Packaging is manufacturing. Interpreted or not, the idea doesn't change.

The great flaw, I think, in this perspective is that a huge advantage of doing a blueprint is to check that the major bits are all correct and in place long before the work starts. That implies that a blueprint, to be useful, must be a small, compact representation of the 'final' design. Going back to my comments with Jack Repenning, I think he was surprised that I said one should be able to create a blueprint for an operating system in six months to a year. The key point was that anything longer was probably not short enough to get the maximum benefit out of doing the work to create a separate blueprint in the first place. The work needs value to be effective. No value, no need for a blueprint. As such, I fully expect that if a format for creating a useful blueprint really exists for software, specifically in this case for a new operating system, it absolutely should not require much more than a year to get down the details. The longer the time, the more useless the work, but I will get back to that later (the time doesn't have to be from scratch).


A PILE OF UNIX

UNIX keeps popping up in this discussion not necessarily for what it is -- although operating systems are very complex and getting more so all of the time -- but because it is the root of all programs, is extremely well understood on a technical level, contains minimal messy business issues and for the most part is extremely well-documented. Yes, well documented. If you hit the book store (and find a few out-of-prints), you can gather a collection of books on UNIX that include the Stevens books, the lists of the internal data structures, man pages, a reference book for every major tool in UNIX, and a few for non-major ones. Throw in excellent books like the AWK one, and another couple on shell programming, and you have a huge number of books on UNIX, in general and in specific.

So imagine one day, that absolutely every 'bit' in UNIX is erased. Poof. Gone. Destroyed. Dumped into the bit bucket. Lost forever, with absolutely no way to get it back. If we ran around the various libraries (and a few houses) and gathered together all the books on UNIX that we have, would that be enough information for us to re-create it?

The answer is emphatically: yes. Linux is a near-perfect example, as it was based on second-hand knowledge of Minix and other UNIXes. We could assemble a team of programmers, give them the UNIX books and set them to work recreating the software. Of course, depending on their focus on the details in the books, and their interpretation, the results would be a little different, but UNIX would once again exist, and it could be similar to the original. They might even fix a few small systemic problems.

From Jack Reeves we know that, depending on your definition, there is a design, and from the pile of UNIX books we know that it doesn't have to be in the form of 'code' as we know it. Moreover, from the books we know it can be full of ambiguities and human-based inaccuracies. It just has to be enough to get the programmers 'there'; it doesn't have to be perfect, as that is not its goal or function.

The obvious problem with the pile of books is that it is too big by far to be useful. With something that size it is easy for human nature to take over and let the programmers start creatively interpreting what they read and 'fixing' it, spiraling the project into the never-ending effort that essentially fails to get done. There is, it seems, a correlation between the size of the design and the desire of the programmers to ignore it and go their own way. Anyone on a huge mega-project has probably felt that at some point or another. But that is really an issue of discipline and organization, not content.

Still, if we cannot find something small enough, we do not really have solid 'working' blueprints.


THE REALLY BIG GUYS

This seems like a good time for a tangent to me. Why flow linearly when we have the freedom to jump around a bit? It is far more exciting.

I wish I knew more about the evolution of our modern construction techniques, starting with huts and ending in skyscrapers. Construction and design are incremental; skyscraper designs have been evolving over the last 100 years, probably starting long before the Eiffel tower. They built the tower for the world's fair as proof that a metal structure was viable and would allow buildings to exceed a simple three-floor limit. This was a momentous leap in construction, one step of the many that have led to our modern behemoths. Skyscrapers, then, didn't just spring into existence; they evolved, design after design, each time getting larger and more complex.

What makes them so fascinating is that they are phenomenally complicated structures to build, yet even with that, once started they usually get done, and they rarely fall down. It is absolutely that type of 'record' that makes any old-time programmer drool. What we wanted twenty years ago, what we hoped for, was the ability to build complex systems that actually worked. And to build them in a way where we weren't just taking wild guesses at how to get it done. Should any piece of software achieve the elegance and beauty of even the ugliest skyscraper, with a comparable amount of complexity, that system would just blow us away. They don't guess about minimum concrete thickness and hope to get it correct. We do.

But then again, I am guessing (sorry, it is habitual, part of the profession). Skyscrapers are designed, but they don't really start from scratch with each design either. There is an evolutionary process where the buildings grow on the principles and standards of other buildings. There is something to that which we need to understand.

I know it takes years to validate the designs, and usually longer than a year to actually build one, but if you put all of the man-years of effort to get the 'core' of a skyscraper built up against any one of our longer-running commercial software packages, I'm tempted to guess that there was actually less time spent on the skyscraper. We pour a tremendous amount of effort and testing into many of our commercial products; a staggering, mind-boggling amount of effort if you trace some of the more notorious ones right back to their origins. They are huge sinkholes for effort.

I've never seen the blueprints for a skyscraper, but I'd guess that they are a lot smaller than my earlier pile of UNIX books. We should marvel at how they build a building that large and complex, with far less documentation, and big distributed teams of multi-disciplinary specialists, while ensuring that the quality of work is good to excellent. Complexity for complexity, pit that against an equally sized software project, and consider that the initial odds of the code even getting partially finished are way less than 50/50. What have they got that we don't?


BRING IT HOME TO US

From this viewpoint, I find it hard to believe that there isn't some obvious form of blueprint. After all, we definitely know one can exist; it's just that in that case it is too large to be useful.

One of the favorite arguments against the existence of blueprints is the circular one: that if it were possible, it would exist already. That ignores two key issues: a) computer science is still young, so we've barely started building things, and b) the biggest works of computer science have been behind closed doors. In the second case, someone may have already come up with the perfect format; we just aren't aware of it yet. However, given the nature of the industry, this type of thing has little IP and big bragging rights, so it's likely that unless there was fear of Microsoft getting their hands on it and wreaking havoc, it would have made it out into the general public pretty swiftly.

To me a more likely explanation is culture. Right from the beginning, programmers have been trying to distance themselves from engineers. It's that inherent desire to not be highly constrained during development that is probably the most likely explanation for not having blueprints. We don't have them because nobody is looking. Blueprints run opposite to the hacker culture. The freewheeling chaos of the agile extremist movement is the type of dynamic environment that most programmers really want to work in. Work should be fast, fun and flexible.

While it's hard to argue with that, I do find that fast, fun and flexible often leads to 'fucked', which is a downer. I guess as I've gotten older I am less strongly motivated to whack out code and more motivated to build complex, sophisticated machinery that I know -- not guess -- will do the proper job correctly. What good is a dynamic environment if none of the stuff actually works? Even if you enjoy work, are you really proud of 'that' mess you made in the code? You knew the right way to build it, so why didn't you? Is it always somebody else's fault?

So, if we can get past our biases and imagine a world where blueprints really do exist -- but not at the cost of making coding some 'dreaded cog'-like position -- it is easier to realize that there aren't any easy, concrete reasons why we don't have or use blueprints. Blueprinting works for some really big and complex things; it could work for us too.

More interestingly, at least for myself if not for a large variety of other programmers, I always visualize the code before I sit down at the keyboard. Internally, I can see what I am writing, but I don't see it as a massive sequence of steps to execute. I see it as some sort of 'abstract machine', but it is in my head without real physical manifestation; indescribable, but always enough to allow me to run through its operation to make sure it will work correctly. So I know that 'it' can be smaller, and 'it' can fit in my head, but unfortunately for me, I have no idea how to pass 'it' onwards to other people.

Also, human as I am, sometimes my internal design conveniently skips a point or two of physics, so that making it work in the real world is a similar but not exact translation. Still, it is exactly that type of internal model that has ensured that the systems I have worked on over the years start their development with better than a fighting chance to survive. When they didn't make it to a release, it wasn't coding problems that brought them down.

Guessing at what works and knowing it are two different things altogether. The importance of removing the 'guessing' from computer science cannot be overstated. It is the big evil fire-breathing six-tonne dragon sitting in the room with us each time we get passionately talking to management about how we are going to solve the next problem. We 'think' we know how to do it, but ultimately, as we all cry during the estimation stage, we're not sure because we've never done 'this' before.

Oddly, having some established place to start wouldn't be all that bad. If you could take something you know worked, and then use it to push the envelope, the stresses would probably be way less. All of that FUD that comes midway through a big development, as the various parties are starting to lose faith, could be avoided. The last plan at least should have been a working system, so there is always a fallback if things are too ambitious. It is these types of attributes, plus the ability to set loose the design on a large team and 'know' that it will be built correctly, that should make all software architects extremely envious of their construction peers. That type of certainty only belongs to programmers who successfully delude themselves, or those who have actually finished their third system -- end-to-end -- without failure. The latter are an extremely rare breed in software.


THE NATURE OF A BLUEPRINT

A short, simple model to prove that a specific design will work as predicted is a small thing in most industries, but a huge one in programming. Sometimes when it is hard to visualize, I like to go at a problem by addition and subtraction. Vaguely, what is it, what is it not? In this, there are attributes that a blueprint must absolutely not have:

- too much detail, wasted effort
- too many pretty charts, seriously wasted effort
- things that don't make large or architectural differences

but there are some attributes that have to be there:

- all details that really make a difference (pivotal ones)
- how to handle errors; every program fails, and the design must deal with it
- all vertical and horizontal architectural lines (and why they exist)

I'm sure there is more, but one shouldn't take all of the fun out of it. Whatever is there, it must be based around a minimalist view. Too much stuff doesn't just waste time, it doubles up its effect by obscuring the details that really matter.

To this I should add a quick note: with any complex business problem, there is an inherent amount of chaos built directly into it, and it is constantly changing. We know this; it is the Achilles heel that brings down a lot of projects. The agile approach is to embrace this and change with it. My alternative is to take the changes themselves and use them as the lines along which to draw the architecture. So, instead of getting aggravated with the users as they ping-pong between a series of options, you simply accept that allowing the ping-ponging itself is one of the requirements. In that way, instead of running from the chaos, you turn and embrace it as part of the design. Yes, to some degree it is more expensive, but if you consider that the alternative could be failing, then it is far, far cheaper.
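
As a small illustration (Python, with hypothetical names, not any particular system of mine), embracing the ping-ponging can be as simple as turning the choice the users keep flipping on into a configuration point, rather than a hard-coded decision that has to be recoded each time they change their minds:

    # Hypothetical example: the users keep alternating between grouping a sales
    # report by region and grouping it by product, so the grouping itself
    # becomes part of the design instead of a source of rework.
    GROUPINGS = {
        "by_region": lambda sale: sale["region"],
        "by_product": lambda sale: sale["product"],
    }

    def summarize(sales, grouping):
        # Total the sales using whichever grouping is currently in favour.
        key = GROUPINGS[grouping]
        totals = {}
        for sale in sales:
            totals[key(sale)] = totals.get(key(sale), 0) + sale["amount"]
        return totals

    sales = [
        {"region": "east", "product": "widget", "amount": 10},
        {"region": "west", "product": "widget", "amount": 25},
        {"region": "east", "product": "gadget", "amount": 5},
    ]
    print(summarize(sales, "by_region"))   # {'east': 15, 'west': 25}
    print(summarize(sales, "by_product"))  # {'widget': 35, 'gadget': 5}

The axis the users ping-pong along becomes one of the architectural lines, so their next change of heart is a new entry in a table, not a rewrite.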


A WILD GUESS

With the above set of attributes, I can make a completely wild guess as to what should be in a blueprint.

In a sense, you can see a function in a computer as a set of instructions tied to a context. If the code is brute-forced, then for each and every function you have an explicit listing of all of the steps. The more modular the code, the more of it is shared between functions. The more generalized, the higher and more abstract the code. Either way, there is some low level of detail within the steps and their arrangement that is needed to actually make the code work, but there is at least one higher level of detail that imparts an understanding of how the code works, without explicitly laying out all of the details.

In a sense we could split it into micro-coding, the actual list of instructions in a specific language, and macro-coding, the essence of creating that list in a higher representation. A function written in macro-coding is a simplified summary of the 'main' details, but is not specific enough to work. It needs, added to it, all of the micro-coded details. Pseudo code is one common form of macro-coding, but its general practice is still fairly low-level. The value in this is to find that higher level of expression that still defines the solution to the problem, without going too far and disconnecting the programmer.
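
A tiny, contrived sketch of the split (Python, with an invented log format) might look like this: the macro-coding is the two-line summary a blueprint would carry, and the micro-coding is one programmer's particular way of realizing it:

    # Macro-coding (blueprint level):
    #   read the log file, keep only the failed logins,
    #   and report how many failures each user had.
    from collections import Counter

    def failed_logins_per_user(path):
        # One possible micro-coding of the summary above; the log layout
        # (<timestamp> <user> <LOGIN_OK | LOGIN_FAIL>) is assumed for the example.
        failures = Counter()
        with open(path) as log:
            for line in log:
                fields = line.split()
                if len(fields) == 3 and fields[2] == "LOGIN_FAIL":
                    failures[fields[1]] += 1
        return dict(failures)

Another programmer might stream the file differently, use a plain dict, or pipe the whole thing through awk; the macro-coding stays the same while the micro-coded details vary.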

A useful blueprint, then, is the 'shortest' high-level representation of the code that is able to let one or more humans explicitly 'understand' what the code is going to do, without being detailed enough or rigorous enough to actually be the code.


FINAL THOUGHTS

The definition of insanity -- they like to tell us jokingly -- is to continue to do the same things over and over again, but expect a different result. Given that, by any standard, the better part of the whole software industry is totally 'whacked'. What started as a 'software crisis' decades ago is now a full-blown software calamity. We depend so heavily on this stuff, yet we have no idea how to actually build it, and every new generation of programmers goes back to nearly ground zero just to remake the same crappy mistakes over and over again. We are in this endless bad loop of churning out near-misses. Things that 'almost' work. Stuff that kinda does what it is supposed to, so long as you don't 'push' it too hard. Bleck!

Mostly, skyscrapers don't have even a fraction of the types of problems that our best examples of software are plagued with. Yes, time is a key difference, but the evolutionary cycle of just enhancing the last design a bit is also clearly a big part of it. Each step is validated, but the next step can make real leaps. Slowly but surely, the buildings have improved.

Another reason is that if they built skyscrapers using the same type of twenty-year process of just 'slapping' on new parts, in much the same way we try to 'iterate' from a starting point into something greater, the building would be so damned ugly and scary that the local government would go after the owners and make them take it down, either because it was an eyesore or because it was just plain dangerous. Most software is fortunate that it is not visible to the naked eye, or it would suffer a similar fate.

The big problem with software is that we are not learning from our past mistakes, yet we are frequently exceeding our own thresholds of complexity. We build too big, too fast, and we have no way of going back and learning from it. A single programmer might grow over a long career, and help move a team into building more complex things, but we really are an industry that puts our older programmers out to pasture way too quickly, so there is no history, nothing to build on.

Blueprints, then, wouldn't just leverage a small number of people's expertise; they would also allow retrospective mining of the strengths and weaknesses of various systems built over the years. That's the type of knowledge that we are sorely missing right now, yet could easily use. It leaks out of the industry faster than we can restore it. We aren't even leveraging 10% of a computer's ability, and we are running around like chickens with our heads cut off just to keep our existing poor band-aid solutions from falling over. We really need to find a way through this.

Nothing is more important than using our time wisely. And that comes from not just winging it, or guessing and being wrong. Luck is a harsh mistress. The difference between being able to hack out a quick solution that hopefully works, and being able to carefully assemble a solid working solution to a problem, is absolutely huge, but it is a completely misunderstood distinction in the industry. Setting down one person's vision, then evolving it to the next level, in a way that is transparent and documented, is such a huge fundamental change to software that if we were to find working blueprints, the consequences not only for our industry, but also for our societies, would be immense. An evolutionary leap worth being a part of.

In the end though, this isn't only about programmers wanting to leverage their design abilities so that they can build bigger systems. Instead, this is about freeing our industry from an ongoing crisis / dark age and allowing us to finally utilize the hardware correctly, to build truly wonderful software that really does help its users reach higher levels of productivity and understanding. You know, all of those platitudes that vendors have been tossing into their commercials for years and years, that just haven't materialized in our current technologies. A promise we made, but have been unable to keep.