Wednesday, November 28, 2007

From Our Inner Depths

The biggest problem with the World Wide Web is that it is just too hard to find anything "unique" anymore. In an instant, you can gather together at least fifty hits for just about anything. Are there any two-word phrases left that don't get at least one hit? It is hard to imagine.

Despite the existence of many other published uses, I'm still going to propose a new definition for a pair of old words. Sometimes you can't let prior history hold you back, particularly when the terminology is so appropriate.

If you've been using software for a while, you've probably been there. It is version X of the program; some function you relied upon, while still in the system, no longer works correctly. Sometimes it is a bug, sometimes it is a legacy problem, but often it is just because the new programmers maintaining the code were clueless. They had no idea how the system was really used in the real world, so they removed or broke something vital. They get so busy adding new crappy features and ignoring the existing bugs that they don't realize how much they are diminishing the code base.

Software -- we know -- starts out needing a lot of features. Over its life, a big program often has many, perhaps hundreds, of developers poring over the code and enhancing it. As the development grows, the quality and functionality get better and better, until at some point they reach their maximum. That peak is never a good thing. It is an inverted curve, going up often quite quickly, but after the high there is nowhere else to go but down. Down, down and down.

Now you'd think in this day and age we'd know when not to mess with a good thing, but the horrible little side effect of capitalism is that nothing can ever remain static. Ever. If it ain't growing, then it must be dying. Or at least not living up to its full revenue-generating potential; often the same thing in business school.

And so, all of our favorite software packages 'arc', reaching their highs in some early release, only to be followed by a long, slow, steady decline into the abyss. If you've hung around long enough, you know this to be true. What goes up, as they like to say...

Back to my career in creative terminology.

You should always honor those who paved the way; it is only fitting. As we take into consideration the above issue about the lifespan of software, we need to fall back a bit and consider the user's perspective. Each of us who has aged a bit has had the misfortune of being bitten by some new dysfunctionality in the software. While that may be interesting -- often depending on how pressed we are for time -- our reaction to finding it out is less than amiable. Like automotive drivers completely losing it on a busy highway, I've seen many instances of software users succumbing to "Word Rage" in my day. When the user shakes their fist at the screen, curses the maker of the product and vows to set forth as much negative karma as possible on those responsible for the atrocity, you know you are seeing a genuine instance.

Word Rage. It hits when that #$%$% auto-formatting stuff rearranges a perfectly good document. It hits when you try to turn off those brain-dead auto-typing crapifiers. When the menus are truncated. When that hyperlink nonsense doesn't work. Any of it. When the backup copies are no longer saved, or the save files are no longer readable. When the documentation tells you nothing, and there are fifty million stupid meaningless options choking up the dialog, obscuring that one stupid thing that needs to be unset; again. Whenever that program carelessly wastes hours and hours of your time, while you were only trying to do something both obvious and practical that any decent-minded programmer should have thought necessary. Word Rage!

But alas, while I coined the term after the masters of inciting it, many, many other software packages produce equally dramatic effects. Just recently, I found that the plugin I used in a well-known photo manipulation program wouldn't work with the built-in batch capabilities. I couldn't create an action for changing either the height or the width depending on the orientation, and the plugin I used for saving wouldn't work in batch either. Of the four simple things I needed to automate, three of them concealed bugs, or at the very least serious weaknesses. The coders allowed the necessary capabilities to be delegated to the plugin facility, but failed to integrate that facility with the batch capabilities, making batch useless. At least we get nice spiffy plugins that are only partly useless. Way to go, guys!

I could spend all day ranting about this sort of stuff; there is enough for perhaps years and years' worth of blogging. I just doubt anyone would read it; it gets old, fast.

Our software is full of inconsistencies, bad ideas, stupid mistakes and poorly thought-out designs that make it all rather irritating to try to get anything done. I find I don't get any Word Rage if, and only if, I'm not actually trying to get anything accomplished. The stupid machines work fine, just so long as you don't touch 'em. In them good ole days we didn't have a lot of functionality, but at least if it existed you could probably trust it. Now we have the world, but we waste more time fiddling with silly problems. The tools often bite their masters.

Lately I've been swearing off technology, insisting that I'm going to switch to woodworking instead. I need to rely on something that works properly. When you get to the point where you think being a Luddite would actually help make "better" tools, you know you've been enraged once too often.

Wednesday, November 21, 2007

Mind the Gap

Sometimes to get a clearer perspective on a problem we need to leave its proximity and venture a little deeper into the woods.

For this particular post, we need to go way out into the forest before we can come back and make sense of what we are seeing. With a little patience and time, I think you'll find that the rewards from exploring are well worth the stumbling around in the bushes and trees.

As an abstraction, pure mathematics is nothing short of absolute perfection. It exists in its own idealistic dimension with no interference from our world in any way.

As such, arithmetic, for example, is always consistent no matter where you are in the universe. It never changes, it is never wrong, it cannot be reduced to anything lower and it cannot be refactored into anything simpler. Quite an accomplishment, and a model of perfection. The same is true of many other branches of pure mathematics.

As more of the real world gets involved in the equations -- such as in physics -- the details become messier, but with respect to what they describe, the results are still close to perfect. The details cause complexity, but it is inherent in the problem domain, and thus acceptable.

Going out even further, to some mathematically based soft science such as Economics, we see even more real-world detail, but also another type of complexity that is derived directly from humanity. I'd call it "errors", but that's not really fair. It is better described as the "not perfectness" that is driven by human personalities, history, interaction and thinking. It is working knowledge that is full of artificial complexity, where some reduction always exists that could simplify it. Unlike inherent complexity, it is not necessary.

So, we have this point of perfection that we often drift away from, both because of the complexity of the underlying details and because of our own introduced complexity. These points are all important.

There is nothing we can do about real world complexity. It is as complex as it needs to be in order to function. That complexity comes directly from the problem domain and is unassailable.

The complexity we have created ourselves, however, is more interesting. Underneath it lies some minimal real-world complexity, but because of how we got there, the complexity we are dealing with is significantly larger. This leads to a "gap" between the perfect solution and the one that we currently have. It is an important gap, one that bears more digging and understanding.

Looking at software, Computer Science revolves around building tools to manipulate data. In both our understanding of the data and our understanding of the tools themselves we have significant room for introducing problems.

With a bit of practice, it is not hard to imagine the perfect tool manipulating the perfect data representation. We are always driven to complete tasks, so on a task-by-task basis we can close our eyes and picture the shortest path to getting the task finished. We can compare this perfect version of what we are trying to accomplish to the tools and data representations that we actually have, intuitively measuring the gap between the two. And it is a surprisingly large gap.

I am often very aware of just how big that gap is. I was lucky to work on some very complex but extremely well-written systems, so I have a good sense of how close we can really come to a perfect solution. I can guess at the size of the minimum gap, and while some gap must always exist, the minimum can be way smaller than most people realize. There is huge latitude for our software to improve, without actually changing any of the obvious functionality.

We don't need fancy new features; we just need things that work properly.

Over time, the trend that I have seen is for our software to get larger and more sophisticated. It is clearly improving, but at the same time the gap has also been steadily growing, wiping away many of the improvements.

Given our current level of technological sophistication, today's software should be far better than it actually is. We have the foundations now to do some amazing things, but we don't have the understanding of the process to implement them. We have a huge gap. Our software is much farther away from its potential than it was twenty years ago, a trend that is only getting worse with time.

The size and growth rate of the gap are important. They not only give us an indication of how well we have done with our science, but also of what our future directions will be. Given that the gap is growing faster than ever, my virus-laden PC, choked by bad software and malicious attempts at crime and profit, stands as a real embarrassment to the software industry. In the past we have been lucky, and the expectations of users have diminished, but we should really brace ourselves for a backlash. We can only make so many excuses for the poor behavior of our systems before people get wise.

For instance, a virus obviously isn't necessary to make a computer function properly. As such, it is clearly unnecessary complexity that comes from the wrong side of the gap. If things had been written better, creating a virus would be impossible. The gap widened to allow spyware and viruses to become reality; then it widened again to make anti-virus software a necessity. Now we consume massive amounts of human effort on problems that shouldn't exist.

Computers are deterministic machines. Specifically, what I mean is that they behave in completely predictable ways, and with some digging, all of the visible actions of the machine should be explainable. It is surprising, then, to encounter experienced professionals who tell stories of their machines "just acting weird". If it were one or two people, we might write it off as a lack of training or understanding, but who amongst us has not experienced at least one recent unexplained behavior with their PC?

More to the point, for professionals working for years, hasn't this been getting more and more common? All of these instabilities come from our side of the gap. Being deterministic, our computer systems could be built to be entirely explainable. They should just function. Weird behavior is just another aspect of the gap.

The gap is big and getting bigger, but is it really important? When we build tools, we need our foundations to be stable so that we understand the effects of the tool. Building on a shaky foundation makes for a bad implementation. The gap at the base only gets wider as you build on top of it. Nothing you build can shrink it. If it is too wide, you can't even bridge it. The usefulness of a computer and the quality of our tools are dependent on the size of the gap.

So, standing back here in the forest, we see where we are currently with our latest and greatest technologies. Somewhere over the next hill, down into the valley and up the other side is the place where we could be, if we could find a way to minimize the existing gap. We can see it from here, but we can never get there until we admit that 'here' is not where we want to be. If we stay where we are, the things we build only get farther away from our potential. We amuse ourselves with clever bits of visual trickery, but progress doesn't mean more dancing baloney; it really means closing the gap. Something we should do as soon as possible.

Wednesday, November 14, 2007

Acts of Madness

Just a quick posting:

What kinda madness would convince people to base Open Source projects on proprietary technology, particularly when the technology belongs to a company best known for ripping off other people's technologies?

How much pain do people self-inflict because they fail to learn lessons from history?

Apropos of this: "if you lie down with dogs, you get up with fleas".

Wednesday, November 7, 2007

Is Open Source hurting Consulting?

In an earlier blog entry entitled Is Open Source hurting Computer Science? I looked into whether the glut of Open Source code was causing problems for software developers. In the comments, Alastair Revell correctly pointed out that my arguments were biased towards shrink-wrapped software. While that represents a significant chunk of the software market, it is by no means the majority of development.

As I was pondering how to respond, I realized that the circumstances for in-house development and consulting were far more complex than for packaged software. It was a dense enough subject to make for a good solid blog entry; a chance that shouldn't be wasted.

We can start by noting that there is a trend away from in-house development. That's not a surprise. For decades the software industry has been selling companies on developing their own custom solutions, only to have the majority of the projects fail; this has left many organizations wary when it comes to development. It is expensive and prone to failure.

I believe that it was Frederick P. Brooks who predicted in The Mythical Man-Month that shrink-wrapped software would replace in-house development, arguing that it would bring the costs down. While he was right about moving away from internal projects, it does seem as if consulting has been the big winner, not commercial packages.

This shouldn't come as a surprise; most companies still feel that a unique process is a competitive advantage, and since software implements that process it too must be unique. The extra costs are worth it.

With the diminishing of in-house development, the key focus for this entry is on how significant volumes of free Open Source software are affecting the consulting industry. For that we need to first go back in time a bit.

Consulting has seen significant changes in the last couple of decades. At the end of the dot-com boom, staff was so hard to find that many programmers cashed in by becoming independent consultants. Staffing big projects was extremely difficult, so the rates were incredibly high. With the bust, the jobs disappeared and left a lot of people scrambling for work, often forced to find it outside of software. In the turmoil, most of the independents and small companies disappeared, leaving an industry dominated only by very large companies.

It is at this point that we need to examine the rise of Open Source code. In a market of only big companies, what is the effect of lots of free code?

For a typical consulting project, access to large amounts of code would appear to be a good thing. Projects take forever, so it is vital to get a quick version of the system up and running. This calms the fears and makes it easy to extend the project. A glut of code means that it is far easier to whack together a fast prototype and extend it to meet the requirements.

You might assume that quality problems with the code would be bad, but in fact extending the project beyond its original tenure is the lifeblood of consulting. An endless need for small fixes is the perfect situation. A smart lawyer and a polite client manager are all that is needed to turn a short-term implementation contract into a long-term cash cow, milking it for all it is worth. Low quality means more work.

All this might be fine, if individuals could start up consulting companies and gradually morph them into full-fledged software companies. Fresh blood entering the market helps keep the prices under control and drives innovation.

With all of the freely available code, you'd think it would be easy for a small consultant to compete. However, the industry contains many significant obstacles.

Everybody wants the latest technology, a target that is changing too fast for most programmers to keep pace with. As well, many of the newer technologies are so eccentric that you need to have experts that specialize only in them in order to get them to work. A basic development project may cover ten or twenty such technologies; too much for any one person.

This means that the glut of code lends itself to being assembled by groups of developers not individuals. Given the basic instabilities of the software, projects made up entirely of independent consultants can be quite volatile. They need significant and patient management to keep them working together.

Past experience with software management drives many firms away from that, choosing instead the big consulting companies that can design, manage, implement and often operate their projects for them. Of course, big firms always have a preference for big "suppliers", making it hard for the little guy to get an opportunity. Little firms go for shrink-wrapped systems.

You'd think companies would have learned; after all, many of them are stuck with these new, fancy, sexy but highly defective systems. These have been added to their roster alongside the fragile client/server systems of the nineties, and their ever-reliable, but slow and hugely expensive, mainframes; the real workhorses of most corporate IT. A trifecta of defective technologies.

CIOs in big companies are drawn toward the big consulting firms for their big new shiny projects. These consulting companies promise quick development and fast turn-around, but sadly the working problems linger on forever. Dragging out the contract is standard. It is just "another" period in the software industry where the big money users -- usually large corporations -- are getting shafted. Another of many that have already occurred.

With so much money to be made, consulting firms no doubt love Open Source. It and any quality problems it has only help the bottom line. Bugs are profitable for those that make money fixing them.

It would be no surprise then that consulting companies don't want to change the circumstances. Innovation does not help. It is good right now. With no new competition, this is unlikely to change until companies wise up.

In an odd way, this could also explain the increasing popularity of the newer lightweight methodologies. If you want fast results and low quality, then you certainly don't want a whole lot of process slowing things down and accidentally making it better, do you? A methodology designed around slapping together quick and dirty prototypes with the least amount of planning fits well into the model of "build it quick and fix it forever". It is a scary thought.

In my previous blog entry I had reached a point where I felt that the glut of code was killing innovation and making it hard for small companies to get into the market. In this entry, I feel that the same glut also makes it hard for individual consultants and small consulting companies to get into the market, thus killing any competition or innovation. Also, related or not, in-house development is dying; it is gradually being replaced by external projects. As things are profitable right now, innovation is an unnecessary obstacle. Not surprisingly, this implies that too much free code is causing significant problems in all corners of the software industry. Too much of a good thing is never a good thing, it seems.

Monday, November 5, 2007

The Art of Encapsulation

In all software projects we quickly find ourselves awash in an endless sea of annoying little details.

Complexity run amok is the leading cause of project drownings; controlling it is vital. If you become swamped, it usually gets worse before you can manage to get it under control. It starts as a downward spiral, picking up speed as it grows. If it is not explicitly brought under control, it swallows the entire project.

The strongest concept we have to control complexity is "encapsulation". The strength behind this idea is the ability to isolate chunks of complexity and essentially make them disappear. Well, to be fair, they are not really gone, just buried in a black box somewhere; safely encapsulated away from everything else. In this way we can scale down the problem from being a monstrous wave about to capsize us into a manageable series of smaller waves that can be easily surmounted.
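
To make that concrete, here is a minimal sketch of what a black box looks like in code. The class and method names, and the policy numbers, are all invented for illustration; the point is that the retry and timeout messiness lives entirely inside, while the caller sees one simple method and nothing else:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Hypothetical example: all of the messy details are buried in here.
    class DocumentFetcher {
        private static final int MAX_RETRIES = 3;    // internal policy, invisible outside
        private static final int TIMEOUT_MS = 5000;  // likewise hidden away

        // The entire public surface: give it a URL, get back the bytes.
        public byte[] fetch(String address) throws IOException {
            IOException lastFailure = null;
            for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
                try {
                    return readOnce(address);
                } catch (IOException e) {
                    lastFailure = e;  // swallowed and retried; the caller never sees it
                }
            }
            throw lastFailure;
        }

        // The black-box internals; nothing outside depends on how this works.
        private byte[] readOnce(String address) throws IOException {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            conn.setConnectTimeout(TIMEOUT_MS);
            conn.setReadTimeout(TIMEOUT_MS);
            try (InputStream in = conn.getInputStream()) {
                return in.readAllBytes();
            } finally {
                conn.disconnect();
            }
        }
    }

A caller just writes new DocumentFetcher().fetch(someUrl) and gets the bytes back; it never learns, and never needs to care, that there were timeouts and retries going on underneath.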

Most of us have been up against our own personal threshold for keeping track of the project details, beyond which things become too complex to understand. We know that increasing the size of the team can push up our ability to handle bigger problems, but it also increases the amount of overhead. Big teams require bigger management, with each new member becoming increasingly less effective. Without a reasonable means to partition the overall complexity, the management cross-talk for any significantly sized project would consume the majority of the resources. Progress quickly sinks below the waves. Just adding more bodies is not the answer; we need something stronger.

Something that is really encapsulated hides the details from the outside world. By definition, it has become simpler. All of the little details are on the inside, and none of them need to be known on the outside to be effective. In this way, that part of the problem has been solved and is ready to go. You are free to concentrate on the other unresolved issues, hopefully removing them one by one until the project comes down to just a set of controlled work that can be completed. At each stage, the complexity needs to be found, dealt with and contained.

The complexities of design, implementation and operation are very different from each other. I've seen a few large projects that have managed to contain the implementation complexity, only to lose it all by allowing the operational complexity to get out of hand. Each part in the process requires its own understanding to control its own special complexity. Each part has its own way of encapsulating the details.

Successful projects get there because, at every step, the effort and details are well understood. It is not that they are simpler, just that they are compartmentalized enough that changes do not cause unexpected complexity growth. If you are, for example, adding some management capabilities to an existing system, the parts should be encapsulated enough that the new code does not cause a significant number of problems with the original code, but the new code should also be tied closely enough to it to leverage its capabilities and its overall consistency. Extending the system should be minimal work that builds on the existing code base. This is actually easy to do if the components and layers in the system are properly encapsulated; it is a guaranteed side effect of a good architecture. It is also less work and higher quality.
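
As a tiny illustration, building on the hypothetical fetcher sketched earlier: a new management capability can be layered on through the existing public surface, without disturbing a single line of the original code:

    import java.io.IOException;

    // A hypothetical management layer: it builds only on the public surface
    // of the existing class and never reaches into its internals.
    class MonitoredFetcher extends DocumentFetcher {
        private long calls = 0;
        private long failures = 0;

        @Override
        public byte[] fetch(String address) throws IOException {
            calls++;
            try {
                return super.fetch(address);
            } catch (IOException e) {
                failures++;  // the new layer only observes; nothing underneath changes
                throw e;
            }
        }

        // The new management capability, cleanly added on top.
        public long getCalls() { return calls; }
        public long getFailures() { return failures; }
    }

Because the original class kept its details to itself, the extension is a few lines of obvious code rather than risky surgery on the internals.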

Given that, it is distressing how often you see architectures, libraries, frameworks and SDKs that should encapsulate the details, but do not. There is some cultural aspect to Computer Science where we feel we cannot take away the choices of other programmers. As such, people often build development tools to encapsulate some programming aspects, but then leave lots of the details free to percolate upwards, violating the whole encapsulation notion. Badly done encapsulation is not encapsulation. If you can see the details, then you haven't hidden them, have you?

The best underlying libraries and tools are ones that provide a simplified abstraction over some complicated domain, hiding all of the underlying details as they go. If we wanted to get down and dirty we wouldn't use the library, we would do it directly. If we choose to use the library, we don't want to have to understand it and the underlying details too. That's twice as much work, when in the end, we don't care.

If you are going to encapsulate something, then you really should encapsulate it. For every stupid little detail that you let escape, the programmers above you should be complaining heavily. If you hide some of the details, but the people above still need to understand them in order to call things properly, then it wasn't done correctly. We cannot build on an infrastructure that is unstable because we haven't learned how it works. If we have to understand it to use it, it is always faster in the long run just to do it ourselves. So ultimately, building on a bad foundation is just wasting our time.
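
A quick sketch of the difference, again with invented names on both sides. The first version claims to wrap up writing data, but every caller still has to learn the internals just to call it in the right order; the second actually hides them:

    // The leaky version: callers must know the internal buffer size, the
    // byte order, the compression level, and that init() must come first
    // and flush() is mandatory. Nothing has really been hidden here.
    class LeakyWriter {
        public void init(int bufferSize, boolean nativeByteOrder) { /* ... */ }
        public void write(byte[] data, int compressionLevel) { /* ... */ }
        public void flush() { /* ... */ }
    }

    // The encapsulated version: those decisions are made inside, once, by
    // the one programmer who actually understands them.
    class CleanWriter {
        public void write(byte[] data) {
            // choose sensible buffering, byte order and compression in here,
            // and flush automatically; none of it percolates up to the caller
        }
    }

Both do the same job, but only the second one takes the details out of play for everyone above it.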

It is possible to properly encapsulate the details. It is something we need to do if we want to build better software. There is no need to watch your projects slip below the churning seas. It is avoidable.