The Programmer's Paradox: The Generation of Complexity

"Everything should be made as simple as possible, but not simpler." -- Albert Einstein

When I was younger, I had a tendency to view history as being a smooth transition between the major events of mankind. Things progressed, we evolved and gradually we got to the point we are at today, with our knowledge and our technologies. All in a nice neat, smooth line.

However, as you delve deeper and deeper into the past, you start to realize that the chaos of modern times has always been the way things have worked with people. Far from being a clean and orderly transition, history is a boiling cauldron of great leaps forward, and horrific leaps back. Societies live and die; often they come crashing down from their great heights, or slowly sink into a stupor. Knowledge is found, bent and then forgotten along the way. It is a twisted path, one that is even deliberately erased or altered from time to time, making it that much harder to understand.

It makes sense, after all, since history is all about the interactions of people with each other and with the elements of nature around them. The disorder we see in the world today is the disorder that has always been there. We're no different from people two thousand years ago, other than we have an expanded set of knowledge, and we have access to far more technological gadgets. The same intellect and curiosity have existed in our species for a long time now.

Far from being 'old news', history is the driving force behind everything we know, and why and how we know it. Alternatives in people, places and times would have left us with similar, but significantly altered understandings.

All of our cities, states, and countries owe their names and definitions to the turmoil of history. Our sciences and engineering were all driven by personalities and events, great and small. Progress, or the lack of it, can be traced to the periods in time when great people swayed their populations, their societies or their contemporaries into achievement or silence.

And our fundamentals, those core pieces of knowledge on which all else rests, are most often named for the people or organizations that created, inspired or dominated them. Even our technological weaknesses bear strongly on the underlying processes and conventions in our world.

That is, everything we know, in any discipline, comes to us from an arbitrary series of events. Someone discovers something. Other people extend it. Successes or failures occur, and we change our underlying assumptions. All in all, it is driven first and foremost by the character and personalities of those initially involved, and shaped, time and time again, by those who stand on their shoulders.

A completely different group of humans might still reach the same conclusions, but ultimately the different names and personalities would leave an indelible mark on their works, and thus our knowledge.

Another factor that is equally important is that we can only build on what we understand. Great leaps happen, but they are few and far between. In the meantime, the rest of humanity absorbs the knowledge, gradually bending it to make it of practical value. Still, each and every fragment of understanding we have depends heavily on a multitude of preceding fragments, and the events and personalities along the way that helped shape them.

Had Albert Einstein been around two thousand years ago, he would not have been able to conceive of the Special Theory of Relativity precisely because all of the groundwork for his understanding had not yet been laid.

Most people know and accept that, but few seem to really understand how that places a layer of arbitrariness over our intellectual endeavors, our processes, our institutions, our knowledge, and our lives. Had the people, personalities or organizations been different, then the history that we know and are familiar with would have also been radically different.

THE SUM OF ITS HISTORY

One of the greatest pleasures of building software is that it allows the developers a rare chance to dig into other people's problems. That is, we build tools for people to use, and to do so well, we must not only understand the tool, we must also understand what they are trying to do with it. We must understand their domain.

To get this knowledge, we need to dig deeply into the roots of their problems, characterize their efforts and then find useful ways to map that back to some automated, or semi-automated computer software. Without this prerequisite knowledge, it is unlikely that the tools will help significantly, and often they can actually become impediments, making the problems worse.

Some developers make the mistake of looking down from their perches, staying far away from their users. Generally, the result of this is overly simplified, or even highly complex solutions that do not fit well to the actual problems. Tools that are awkward or painful to use.

We've known for a long time that the best, most useful tools have been driven by developers in close proximity to their users. To really help, one has to really understand, and to really understand, one has to have a deep, although not necessarily complete, knowledge of what they are trying to automate.

It is in this relentless digging that software developers are often exposed to the real ugly underbelly of their target domains. That is, we have to see the arbitrary messiness of it all, and then try to lay some structure on top in order to bring the problem down into something that is manageable, both by the computer and by the development team itself.

Software is notorious for frequently changing, but those changes are more often the result of mistakes in understanding by the original developers than they are shifts or changes in the underlying domain. That is, most of what is wrong about the software comes from a failure of the designers and programmers to really understand the domain. Or even sometimes, to understand what aspects of the domain are inherently flexible.

Most domains have been around and established for a long time; they have settled nicely into a set of practices and conventions. To the domain experts, who specialize in these branches of knowledge, there is a certain consistency and structure to their work. To outside observers, such as developers with limited exposure, things may seem a little more haphazard. And that is where history comes crashing back into the software development process.

That is, the underlying complexity of a domain is built up gradually through history as a result of the personalities and organizations that have come and gone from the domain. History is the driving force behind the underlying complexity in the data, the process and all other aspects that define the domain.

TYPES OF COMPLEXITY

Complexity is a complicated beast. It is not so much a thing as it is the difference between two related things. That is, if you take two similar things, one simple, and one not, then the difference is pure complexity.

That is a relative definition of complexity; we could try for an absolute one, but for any metric we assign, numeric or otherwise, the meaning would essentially be arbitrary. If we choose some number X, then the difference between 15 and 245 in this complexity number doesn't provide any useful understanding if we can't relate these numbers back to something tangible. And in that relationship, we might as well just stick with some relative difference, since 230 as a number is just as meaningless unless we understand the original two things we are comparing.

So when we think of complexity, we really need to think of how it can range from simple to massive in order to fully grasp what it means. It is how it changes that is important. Of course, simple itself is not an easy concept either as I pondered in:

http://theprogrammersparadox.blogspot.com/2007/12/nature-of-simple.html

Still, even with only a weak relative understanding of complexity we can go forth and examine its two main underlying causes. They are:

- Strange Loops
- Volume

Complexity comes from either the inherent complexity embedded in some underlying idea, or from the sheer volume of simple stuff that is stacked together.

Strange loops, as explained by Douglas Hofstadter in GEB, are hard to understand, non-intuitive concepts like recursion, infinity, self-reference, etc. We see plenty of examples ranging in their 'hardness'. For example, some people find mathematics generally confusing. Certainly, most people find the bending of time and space confusing, if not the way things work in quantum mechanics. Chaos theory, fractals, etc. are all complex concepts that take some effort to be able to understand. However, once grokked, we can then look back and see them as simple. But for the new and uninitiated they can be huge mountains to climb.

Volume speaks for itself. If you have some mass of information, not terribly hard, it still takes a significant effort to work through it all, and remember it. Complexity need not be difficult, it can just be related to size. The sum total of all of the civil law, as rules and regulations for a region, might, for example, be a massive volume of cases and histories all intertwining within themselves. Generally, the underlying cases are just about facts and events, finished with some sort of overall ruling, but still, the legal world can be a complex and painful labyrinth to navigate. What is mostly simple in pieces, can quickly build up to be overwhelming.

And of course, anyone familiar with working in a large bureaucracy would also know the pain of trying to get anything to change. A vast mountain of meaningless process and rules built up over an extended period can lead to a nearly impossibly rigid and static mess.

We see examples of people caught in David and Goliath battles with bureaucracies all of the time. Observers are often left wondering how they could have ever gotten built up so badly. But once you've become a cog in the machine, the tar pit makes considerably more sense, the longer you hang around.

And, mostly in these weighty organizations, it is how the different personalities and politics play out that keep things from progressing; that keep them from changing. Those that complain the loudest, are often the hardest to budge in their own little corners.

Volume-related complexity gets into the system for a few reasons:

- Intrinsic complexity.
- Accumulated through a long history.
- People making the problems more complex than necessary.

Some things have an inherent underlying complexity to them. Fractals are a great example in that at each level the patterns may appear simple, but the self-similarity that spans all of the levels is inherently complex. It took a long time for people to start to understand them, and it will take a long time before they have been fully integrated into our overall knowledge base.

I've already talked a lot about how all things are a sum of their histories, and how even small differences in the histories may account for large shifts in knowledge. Most complexity comes in from a long process of getting built up over time. Each piece may be simple in its own right, but the sheer scale and volume getting built up can quickly become unmanageable.

As well, people often have an inherent desire or need to over-complicate things. Somehow it seems to manifest as insecurity about appearing smart and as a result, many people overdo the required effort to make themselves look or feel better.

Strangely, their results rarely fool other people, but the consequences usually last essentially forever. Once some complexity has become enshrined into a system, there is very little that can be done to remove, refactor, or replace it.

EXAMPLES OF COMPLEXITY

There are so many great examples of staggering complexity in our modern world that it is hard to know where to begin.

We find it easily in our modern lives. Our properties and possessions require effort. The first world owns more stuff per person, on average, than at any other time in history (I am guessing). Everything we own comes with some effort to learn how to utilize it, an expectation for time and some amount of on-going maintenance. The more we own, the more effort we accrue. A massive pile of stuff requires a massive amount of time. Either that or it is just collecting dust in our basements.

For the middle class and above, our lives and our professions drive us to more and more interactions with various different types of professionals. From simple health-care specialists, financial help and advisers, property repair, contracting, career-related support to purchasing both short and long term goods we get a myriad of advice from an ongoing collection of professionals.

So much advice, so often, that it exceeds our abilities to follow it all, correctly. It even exceeds our abilities to remember the bulk of it. It's just an ever growing list of things that we should have done, from which we can only pick a small subset to complete because that's how much time and effort we have left at the end of the day. We all fail to live up to our complete expectations, it has become the norm.

In the world around us, it is commonly understood that "ignorance of the law is no excuse", which is to say that our societies have a strong expectation that we will all know and obey all of our laws. However, most, if not all, modern societies have been adding new legalese to their books for so long, and in such volumes, that it is impossible for any single human to know all of the laws that apply to them.

That is, while we might have some general vague notion, it is unlikely that even the professionals can quote every major law in both the criminal and civil codes. There are simply too many rules. Still, we are held responsible for things we couldn't possibly know. Clearly a defect in our structure.

Earlier I talked about bureaucracies. They are mammoth organizations that have become so plagued by their own processes and rules that they are unable to break away from the status quo at anything other than a crawl. The length of their history only strengthens their problems. Bureaucracies are disasters by definition. In the past, when they were bad but smaller, ignoring them might have been an acceptable option, but now as they threaten the sustainability of our societies we are quickly reaching a point where an action is necessary. Where we cannot allow them to continue on in their broken fashion.

Even our own languages and the way we communicate with them can be affected by complexity. Who hasn't read examples of writing that tries too hard to impress by littering the text with long sentences and complex terms? Truly convoluted writing is a masterpiece in complexity.

There are a huge number of examples of this type of banter, many coming from deep within the academic circles. It is not a surprise given that most academic settings are harsh and highly competitive environments. A lot of people want to be the smartest people around, and many of them are willing to do anything to get those honors.

A related example is how restaurant dishwashers often joke about being "ceramic maintenance engineers". A wonderfully obtuse term whose example is sadly followed in many serious branches of sciences.

Of course, it is not just our lives and the way we organize ourselves that is affected. All of our knowledge from the soft sciences, right down to mathematics itself is plagued with overcomplexity.

Huge structures like bridges can be overly complex, built to withstand unrealistic challenges. Still, this is one of the few areas where overly cautious and overly complex works are not necessarily bad things, in that we don't know how long things should last for. An over-engineered bridge, while costly, is far better than an under-engineered one. Perhaps that is why we have so many? Of course, when we build them but don't need them, they start to impact whether or not our societies are sustainable. Big projects require big maintenance, which requires big money.

Some of our softer sciences are more convoluted than they are real. It is easy to fake a "scientific" method, and then start writing up technical, complex mumbo jumbo that literally means nothing, or draws invalid conclusions. The media is awash with questionable "scientific results" constantly because their findings are usually shocking, which makes for a good news story (regardless of whether or not it is true). Readers want "exciting" and the media is always willing to provide that.

In some cases, such as economics, there is some real strength to the underlying theories, but in practice, a rosy prediction of the future is far more likely to impress, than a truthful one. The actual practice of science is bent towards more short-term objectives, like making a living.

We've seen this often as well in the financial sector, where most recently billions were made by selling complex financial instruments (CDS), that were based on ridiculously bad mathematical foundations. Stuff that was obviously wrong, but people willingly put faith in the embedded complexity, assuming that it was correct because they didn't understand it and it made them money.

A couple of very surprising examples of complexity come from unexpected places.

My view of mathematics was as branches of pure untouchable abstractions, theories, and formulas that can get applied to help us understand the world around us. Still, I've dipped into a couple of these branches where the resulting work is exceedingly obfuscated, to the point where it seems intended to obscure the underlying ideas. The work seems like exercises in excessive symbol manipulation.

Like everyone else, there are enough mathematicians of average ability out there that really want to be seen as standing above their peers, so they resort to trying to spice up their works a bit.

Mathematics consists of many formal systems that are analogous to computer software programs in such a way that the authors of mathematical papers can produce spaghetti definitions, spaghetti theorems, and spaghetti formulas in the same way that a programmer can produce spaghetti code. That is, they can create things that are orders of magnitude more convoluted and complex than is truly necessary. Calling something spaghetti code doesn't mean that it won't work properly, but it does mean that understanding it is hard and changing it is likely fraught with difficulties.

Nature only recently (midway through last century) revealed the structure and shape of its complexities to us in the forms of fractals and chaos theory. We can see the self-similarity of the underlying bits reflect themselves over and over again as we go in and out of the detail of many simple things like trees, mountains, clouds, and forests. In chaos theory, we've come to understand how small changes, even in simple formulas, can have profoundly large effects, and also how things can have seemingly random paths that still appear to orbit about certain regions.
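
As a minimal sketch of that sensitivity, consider the logistic map, x -> r * x * (1 - x), a standard textbook example of a simple formula with chaotic behavior. The choice of this particular map, the parameter r = 4.0, the starting value 0.2, and the function name logistic_orbit are all assumptions made for illustration; none of them come from the discussion above. Two starting points that differ by one part in ten billion are run side by side:

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x) and return every value visited."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by a tiny amount (illustrative values).
a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-10)

# Distance between the two orbits at each step.
gaps = [abs(p - q) for p, q in zip(a, b)]

if __name__ == "__main__":
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap roughly doubles with each step at this setting of r, so a difference far too small to measure at the start grows until the two orbits have nothing to do with each other, even though both stay within the same bounded region.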

Both our manufactured world and the natural one in which we live are rife with limitless examples of complexity. Even the natural progression of our universe, that is, 'entropy', is bound to gradually reduce any order into chaos. Complexity is the state to which all things return, the one we have to fight the hardest to prevent.

THE LIMITS OF COMPLEXITY

Some people are incredibly smart, but even the very smartest of them are not massively more intelligent than the average person. That is, although it is hard to quantify and measure, we won't find someone 10x more intelligent than average, and it is unlikely to even find someone "twice" as smart. Our brains have fundamental limitations beyond which we cannot go. There is only so much we can hold, understand and respond to at a given time. Some people may have moments of greatness, but that is balanced by the rest of their time. No doubt, the smartest man or woman on the planet right now has the occasional off day; days where their level of functioning is well below average.

And certainly, we can see from experience that even a group of smart people, collectively can be working together at a fairly low level of intelligence.

Committees resist intellectual qualities because they essentially normalize their output down to the lowest common denominator. Groups of people do not easily raise the bar of intelligence. Intelligence just doesn't scale; we don't get 10x more intelligent behavior from a group of ten super-smart people. And depending on the state of their interaction, we might not even get 1x more intelligent behavior from a particularly discontent or dysfunctional group of smart people.

What that means is that for every person, there is some level of complexity in front of which they can operate, but beyond which they start to fall apart. Things start to happen, unexpected ones. They cannot cope.

We all have this threshold, over which we cross and our abilities become compromised.

And sadly, or strangely, for most people, their individual thresholds are not nearly as far apart from each other as most people want to believe. There are differences between people's abilities, of course, based on environment or personality, or intellectual capabilities or even just pure memory retention, but few people really transcend their own origins in the way that they've convinced themselves that they have.

We're the masters of making our own myths. Of believing that we've somehow risen above our nature and can now proceed, consistently, at some higher level of behavior and thought. Crashing down from those lofty heights is a common pain, felt by most.

Having well-defined intrinsic limits on the overall complexity we can handle, has forced us to search out newer and stronger methods of mitigating our eventual problems with control.

CONTAINMENT

It helps us so little to understand the nature and form of complexity if we are just going to accept it as is. Complexity builds up and becomes increasingly dangerous as it does so. Things start out OK, but gradually over time, they get worse and worse. Because of that, the next really big leap in our modern age won't be new branches of science or even more effective engineering, but it will be developing an understanding of complexity and being able to systematically reduce it in whatever guise it is hiding.

A significant cause of project failure in software comes from projects where the developers let the complexity run out of control. This is something that most veteran programmers have experienced at least once, if not many times.

It happens either because the project is changing too often, or the scope is gradually getting larger and larger. Either way, the differences and fixing them start to quickly negate any real effort in the works. The project spins its wheels endlessly, at full speed, yet gets nowhere.

Often this is called a death march.

Software developers have a long and sordid history of losing their works to these types of organizational disasters. Once the downward trend starts, it can be difficult or impossible to reverse, and the project is essentially doomed.

But it is in the exploration of these types of issues, that the software community leads many other domains.

We do, after all, acknowledge our short and long-term choices in terms of 'technical debt'. We know to control complexity (even if we don't) and we know the importance of going back over our work and refactoring it, or just doing necessary cleanup.

Different movements in programming have been arguing about the right approaches for decades now, but at least the argument is occurring. You don't see massive bureaucracies seriously try to control or bring down their internal complexity often. There is essentially no refactoring happening in either the scientific community or in our massive organizational structures.

Since it is so much easier to add complexity, little gets done to relieve it.

Still, like an out of control software project, so many of our institutions are plagued with serious complexity. Each and every time I've dug into a specific domain, I have been surrounded with staggeringly bad examples of out of control complexity. Each and every time the domain experts have dismissed it as just being the "way it is"; that it will never change; that you have to live with it.

Admitting to complexity and controlling it has been key to getting many large projects into the successful category in my career. Ultimately if you know the real source of the problems, dealing with them effectively isn't terribly hard. Still, possibly because the history of software is so short, and has been burdened by so many public failures, many software developers can acknowledge their problems and move on. Most other domains aren't so lucky.

Making even a small change to some fundamental issue in mathematics, for instance, even if it did help tremendously, is likely something that would take generations to accomplish. Mathematics, being perceived as more rigid than all other disciplines, is also the one that will easily take the longest to change, even if necessary.

Bureaucracies don't change, almost by definition, and in all ways, every aspect of our modern lifestyles just gets more and more complex. We only have ways to increase the complexity, not to study it, and definitely not to reduce it correctly.

THE STRUCTURE OF KNOWLEDGE

If you're being honest and objective about it, what we know, the sum total of our knowledge as a species is a total mess. It is a ragtag collection of bits and pieces, sometimes stitched together neatly, but mostly just dumped out in batches and clumps, over a long period of time. It includes real universally true knowledge, myths, fallacies, relative truths and a huge collection of various unknown bits. Sections of what we know are tidier than average, but the sum total of all of it is a mess.

What we need is both a way to put a structure over what we understand and ways to systematically reduce what we know into some simpler, more accurate clean form.

We do have some quantitative ways of re-arranging knowledge, mostly akin to refactoring.

Simplification, normalization, and optimization are three ways of re-arranging the underlying information to change its properties. Abstraction is a way of just taking out the essence and ignoring the rest. Encapsulation is a way of hiding the underlying detail from a higher level, while still being able to make significant use out of it. All of these can be applied to any structure of knowledge. In some cases, it may be slow and tedious, but it is certainly possible.

Knowledge comes in levels. That is, at some higher understanding we can see a 10,000-foot view of the problem, but to understand more, we need to get lower and lower into the details. At the detail level, it can be very difficult to understand the significance of things to the overall context. To control and understand it, we need to make it consistent and workable on all of its different levels. That is, a great higher level abstraction of a branch of science is not all that useful if the details are still a huge mess. The structure and consistency at each level need to be harmonized.

Knowledge can be spaghetti. That is, it can be artificially complex in any number of degrees so that it hides the real underlying core. Any text, discipline, idea or thought can be obfuscated to the point where it is super difficult, if not impossible, to discern the original. Of course, we know we can re-arrange the knowledge and drop the complexity. In that sense, knowledge is "code" of some type. Our natural languages form a type of programming language in which we describe the structure of what we know, and the process of using that knowledge in the real world. The sum total of our knowledge as contained structurally is just another system, less formal, but not unlike computers or branches of mathematics.

Most importantly, subsystems of knowledge have isomorphisms to other subsystems. That is, we can map one type of knowledge onto another, and then draw advanced properties and meta-knowledge from both. In this way, if we have two competing branches of mathematics, or two competing sciences or even two competing definitions for bureaucracy, we can apply metrics to them to conclusively show that one is simpler, with respect to some attributes, than the other. We can decide which one is better suited for our use and move down to that version.

In theory, at least, we should be able to redraft a simpler legal system for example, that contains most if not all, of the same depth as the current one, yet whose definition is only a fraction of the same size. Basically, we could boil down all our laws into a much smaller, more workable subset. The same is true in most organizations. With some investigative work, we should be able to simplify things enormously, with little consequence.
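
A toy sketch of what such a comparison might look like: two rule sets that always reach the same decision, compared with a simple size metric. Everything here is hypothetical and invented for illustration, including the attribute names, the rules themselves, and the choice of "number of conditions" as the metric; real bodies of law or process are obviously far too large to check exhaustively like this.

```python
from itertools import product

# Attributes a case can have (hypothetical).
ATTRIBUTES = ("resident", "adult", "employed")

# A rule set is a list of clauses; a case is approved if it matches any clause.
# Each clause maps an attribute to the value it requires.
# ACCUMULATED mimics rules piled up over time, SIMPLIFIED is a redraft.
ACCUMULATED = [
    {"resident": True, "adult": True, "employed": True},
    {"resident": True, "adult": True, "employed": False},
    {"resident": True, "adult": True},
]
SIMPLIFIED = [
    {"resident": True, "adult": True},
]


def approves(rules, case):
    """Return True if the case satisfies every condition of at least one clause."""
    return any(all(case[k] == v for k, v in clause.items()) for clause in rules)


def equivalent(rules_a, rules_b):
    """Check that both rule sets give the same answer for every possible case."""
    for values in product((False, True), repeat=len(ATTRIBUTES)):
        case = dict(zip(ATTRIBUTES, values))
        if approves(rules_a, case) != approves(rules_b, case):
            return False
    return True


def size(rules):
    """A crude complexity metric: the total number of conditions written down."""
    return sum(len(clause) for clause in rules)


if __name__ == "__main__":
    print("same decisions:", equivalent(ACCUMULATED, SIMPLIFIED))
    print("sizes:", size(ACCUMULATED), "vs", size(SIMPLIFIED))
```

The point of the sketch is only the shape of the argument: first show the two versions are interchangeable, then show that one is smaller by an agreed measure. Both steps are mechanical once the knowledge has been written down in a form that can be checked.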

SOME FINAL THOUGHTS

Still, although most people have an inkling that this is possible, in each organization, in each discipline, and in each endeavor history is filled with failed attempts that ultimately have only made matters worse. And it is these, that the pessimists guarding the gates will use as examples of "why it is not possible". The gatekeepers generally believe that they are protecting things, but most often they enshrine the underlying madness. If it can't change, it can't get worse, they say. But it also can't get better.

To get past the natural hesitancy to keep things the same, anything we do with complexity needs to be rigorous, and provable. That is, until you can show decisively that some analysis and refactoring will fix the organization, you'll never be allowed to change the organization. And it is entirely because of this that all things in our lives, over the last couple of centuries, have simply gotten more complex, and entirely out of control.

Either we find some new way of containing and reducing them, or we just simply let history repeat itself by crushing our society and then eventually starting a new one again from scratch in a few hundred years. Death, it seems, is our only current method of effective complexity control. And not surprisingly, death marches are exactly what software developers work exceedingly hard to avoid in their projects. Our modern lives are well on their way towards their own destruction, imploding because of rampant over-complexity.

It is easy to guess what will eventually happen (because it always does), but really hard to find a way out of our fate. Perhaps this time, we've risen high enough, to a degree of collective intelligence, that we can avoid the fates of all other times. Perhaps this time.

The Programmer's Paradox

Sunday, February 7, 2010
