Thursday, April 16, 2009

The End of Coding as We Know It

It's time for a bold prediction.

I know that most software developers see computer language programming as the essential element of software creation. For most people it is inconceivable that we might one day automate large chunks of this effort.

Automation, it is believed, can only come from tasks that are repetitive and require little thinking. Programming requires deep concentration, complex visualizations, and often some pretty creative thinking to get around both the technical and domain-based problems.

Thus, it is easy to make the assumption that because it is rooted in intellectual work, programming in computer languages will always be an essential part of software development.

However, I think this is a false statement.


THE LAST OF THE TYPESETTERS

I can remember my Dad -- an editor for a printing-industry magazine -- taking me on a tour of one of the last vestiges of movable type. In its day, movable type was a technical marvel. They had invented a super-fast way of laying out printable pages.

Movable type involved the careful positioning of thousands of tiny metal letters, called glyphs. A typesetter could quickly create page after page, and reuse the glyphs for more layouts after they were finished printing.

It was a huge step forward for printing, a major leap up from the hand-carved printing blocks of the past. A job could now be printed in a matter of days.

A typesetter was responsible for hand positioning each element. Good quality workmanship was difficult, and the occupation was considered skilled. A great deal of knowledge and understanding were often required to get a high-quality readable layout, including hyphenation, kerning and leading. Typesetters, as a subset of typographers, which also includes font designers, were considered master craftsmen.

The future had looked bright for typesetters. Their highly skilled profession involved the careful layout of huge amounts of text. Printing was a major growth area; one of the great technical endeavors of the industrial revolution. It was a great job, well paying, and with a lot of esteem. It was a smart career choice, surely a job that would be required forever.

Forever didn't last long.

Over time, the technologies improved, yet right up to the 1960s, typesetters were still required. From movable type to hot press, then later to cold press, the work endured while the technologies changed around it. Still skilled, still complex, but gradually becoming less and less significant.

Finally, the growth of desktop publishing killed it. Much of the beauty and art of the layout was lost. Computers, even with good font hints, don't generate the same quality of output; however, consumers no longer appreciate the difference. High-quality typesetting became a lost art.

Typesetting still occurs in some contexts, but it is very different from its origins. It is a specialty now, used for very limited occasions. My dad lamented the end of typesetting, often telling me that desktop published documents were not nearly as readable or classy.


LESSONS LEARNED

So what does this have to do with computers?

It is an excellent example of how easily technology creates and removes occupations. How easily it changes things. Typesetters can be forgiven for believing that their positions would survive far longer. It was a skilled labor job that required a visual eye for detail, a considerable amount of intelligence, and a good knowledge of spelling and grammar for hyphenating the text. It would have been way better than laboring in a factory.

Way back then, with our current frame of reference, it would be easy to assume that the intellectual effort involved in typesetting rendered it impossible to automate. Anybody that has delved into the murky world of fonts and layout knows that it is far messier and way more complex than most people realize. But then we know that aspects of the problem gradually got redistributed to fonts, layout programs, or just lost. The constraints of the original efforts disappeared, and people gradually accepted the reduced quality in their outputs. Automation brought mediocrity. Average.

Programming has aspects to it that require intelligence, but it also contains large amounts of rather mindless effort in the whole work. We code a big mess, then spend a lot of time finding fiddly small problems or reworking it. Intellectual work is intellectual work; no computer can ever do it, but that doesn't mean that it has to be done each and every time we build a system. That's the basis of a false assumption.

While the core of the intellectual work will never change, how it's done, how much of it really exists and whether or not we'll have to keep coding forever are all up for grabs. All we need do is restructure the way we look at programming and we can see some fairly simple ways to collapse the effort, or at very least get far more re-use out of our current efforts.

I'll start with a simple model.


CONTEXTS

Consider the idea of a 'data context'. Basically a pool of data. Each one holding some amount of data. In a context, a datum has a very specific structure and type. The overall collection of data may be complex, but it is finite and deterministic. There are a fixed number of things in a context.

Getting data from one context to another is a simple matter. The data in one context has a very specific type, while it may be quite different in another context. To go from one to the other the data must go through a finite series of transformations. Each transformation takes a list of parameters, and returns a list of modified values. Each transformation is well-defined.

We can see modern computers as being a whole series of different contexts. There is a persistent data context, an application model context, a domain model context and often a lot of temporary in between contexts, while some computation is underway or the data is being moved about. Data appears in many different contexts, and these stay persistent for various different lengths of time.

A context is simply a well-defined discrete pool of data.
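
As a rough illustration only, a context could be sketched in Python as a frozen pool of typed values. The names `Datum` and `Context`, the keys, and the example type names are assumptions made for this sketch; nothing here is prescribed by the model itself.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Datum:
    """One value in a context, tagged with its specific type name."""
    type_name: str   # hypothetical type name, e.g. "Canadian-bond-yield"
    value: object


class Context:
    """A well-defined, discrete pool of data with a fixed set of entries."""

    def __init__(self, name, data):
        self.name = name
        # Wrap a private copy in a read-only view so the pool stays fixed.
        self._data = MappingProxyType(dict(data))

    def get(self, key):
        return self._data[key]

    def keys(self):
        return sorted(self._data)

    def __len__(self):
        return len(self._data)


# A hypothetical persistent context holding two data.
persistent = Context("persistent", {
    "yield": Datum("Canadian-bond-yield", 4.3),
    "term": Datum("integer_2..28", 15),
})
print(persistent.keys())
print(persistent.get("yield"))
```

The read-only mapping is one way to express "a fixed number of things in a context"; a real system would presumably need a controlled way to create new contexts from old ones.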


EXTENDED TYPE STRUCTURES

We often talk of 'type' in the sense of programming language variables being strongly typed or loosely typed. Type, even though it may be based on a hierarchy, is generally a reference to a specific data type of a specific variable. It is usually a singular thing.

In a general context, we do use it as a broader structure based definition, such as referring to Lists, Trees and Hash Tables in abstract data structures, but most people don't classically associate an ADT like List with the 'type' of a variable. They tend to see them as 'typeless' containers.

For this discussion we need to go higher, and think more in terms of an 'extended type', a fully structural arrangement of a much larger set of variables, where the interaction isn't restricted to just hierarchies. The full structural information for an extended type includes all of the possible generalizations of the type itself, including any alternative terminology (such as language translations).

The type of any variable, then, is all of the information necessary to be very explicit or completely generalized in the handling of any collection of explicitly related data. The extended type information is a structure.

We can take 'type' to mean a specific node in this complex graph of inter-related type-based information. A place in a taxonomy for instance.

Type, then, includes any other reasonable alternative "terms" or aliases for the underlying names of the data. For example, floating-point-number, number, value, percentage, yield, bond-yield, bond-statistic, financial-instrument-statistic, Canadian-bond-statistic or Canadian-bond-yield may all refer to the same underlying value: 4.3. Each title is just another way of mentioning the same thing, although its reference ranges from being very generalized to being very specific.
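
One way to picture this is as a small graph in which each name points at its more general names. The sketch below uses the names from the example; the particular edges between them are my own guess at a plausible arrangement, and `PARENTS`, `generalizations` and `is_a` are hypothetical names.

```python
# Each type name maps to its more general names (an assumed arrangement).
PARENTS = {
    "Canadian-bond-yield": ["bond-yield", "Canadian-bond-statistic"],
    "Canadian-bond-statistic": ["bond-statistic"],
    "bond-yield": ["yield", "bond-statistic"],
    "bond-statistic": ["financial-instrument-statistic"],
    "financial-instrument-statistic": ["value"],
    "yield": ["percentage"],
    "percentage": ["floating-point-number"],
    "floating-point-number": ["number"],
    "number": ["value"],
    "value": [],
}


def generalizations(type_name):
    """Return the type itself plus every more general name it can go by."""
    seen = set()
    stack = [type_name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def is_a(specific, general):
    """True if data of type `specific` may also be referred to as `general`."""
    return general in generalizations(specific)


print(sorted(generalizations("bond-yield")))
print(is_a("Canadian-bond-yield", "number"))
```

Because a name can have more than one parent, this is a graph rather than a strict hierarchy, which is the point of calling it an extended type.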

Type can also include a restricted sub-range of the fully expressible type. For example, it may only be integers between 2 and 28. Thus an integer of 123 cannot be mindlessly cast to an integer_2..28, it does not belong to that 'type', but the integer 15 does.
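
A restricted sub-range might be sketched like this. The class name `RangedInt` is hypothetical, and treating "between 2 and 28" as inclusive at both ends is an assumption.

```python
class RangedInt:
    """An integer type restricted to an inclusive sub-range (assumed inclusive)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.name = f"integer_{low}..{high}"

    def accepts(self, value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(value, int) and not isinstance(value, bool)
                and self.low <= value <= self.high)

    def cast(self, value):
        """Return the value unchanged if it belongs to the type, else raise."""
        if not self.accepts(value):
            raise ValueError(f"{value!r} is not a {self.name}")
        return value


integer_2_28 = RangedInt(2, 28)
print(integer_2_28.cast(15))       # 15 belongs to the type
print(integer_2_28.accepts(123))   # 123 does not
```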

Data of one type can be moved effortlessly to data of any other type in the same structure; they are one and the same. Data that is not within the same type structure requires some explicit transformation to convert it.


TRANSFORMATIONS

A transformation is a small amount of manipulation required to move some data from one unique type to a different one. Consider it to be a mini-program. A very specific function, procedure or method to take data of type A, and convert it to type B.

A transformation is always doable. Any data that comes in, can be transformed to something outbound (although the results may not make sense to humans). Transformations are finite, deterministic and discrete, although they don't have to be reversible. The average value of a set of numbers for example is a non-reversible (one-way) transformation.

Transformations can have loops, possibly apply slightly different calculations based on input, and could run for a long time. Also the output is a set of things, basically anything that has changed in some way from the input. There are no side-effects: everything modified is passed in, everything changed is returned.

The transformation is specific, its input is a set of specific values of specific types, and its output is another set of values of specific types. No conditional processing, no invalid input, no side-effects. A transformation takes a given variable, applies some simple logic and then produces a set of resulting variables.
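
To make these properties a little more concrete, here is a minimal sketch of three transformations written as side-effect-free Python functions that each return a tuple of results. The function names are made up for illustration; the averaging case is the non-reversible example mentioned above.

```python
from statistics import fmean


def average(values):
    """One-way transformation: many numbers in, a single number out."""
    return (fmean(values),)


def percent_to_fraction(percent):
    """Reversible transformation: a percentage becomes a fraction."""
    return (percent / 100.0,)


def fraction_to_percent(fraction):
    """The inverse of percent_to_fraction."""
    return (fraction * 100.0,)


# Two different inputs give the same average, so the input cannot be recovered.
print(average([1.0, 3.0]) == average([2.0, 2.0]))
print(percent_to_fraction(4.3))
```

Returning a tuple even for a single result keeps every transformation in the same "set of values in, set of values out" shape, which is a design choice of this sketch rather than a requirement.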

The underlying language for transformations could be any programming language, such as C, Java, Perl, etc. I think most of the modern functional programming languages, such as Haskell and Erlang, define their functions in the same manner (although I am just guessing), but Perl is the only language that I am aware of that can return lists (as a native variable) from function calls.


PUTTING IT ALL TOGETHER

The three simple concepts: contexts, types and transformations form a complete computational model for utilizing computer hardware.

We can skip over any essential proofs, if we accept that the model itself is just a way to partition an underlying Turing complete language, in the same way that Object Oriented doesn't make anything more or less Turing complete.

I think that higher level structural decompositions do not intrinsically change the expressibility of the underlying semantics. In other words, nothing about this model constrains or changes the usability of the underlying transformation programming language; it just restructures the overall perspective. It is an architectural decomposition, not a computational one.

A path from context to context, involves a long pipeline of simple transformations. Each one takes a specific set of input, which it converts into output. To further simplify things, each transformation is actually the smallest transformation possible given the data. Each one does a near trivial change, and then returns the data. If there are conditional elements to the path, that processing takes place outside of the transformations, at a higher level. The transformations are always a simple path from one context to another.
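
A pipeline of this kind could look something like the following sketch. All of the function names are hypothetical; the thing to notice is that each step is near trivial, and the one conditional decision sits in `choose_pipeline`, outside of the transformations.

```python
from functools import reduce


def strip_whitespace(text):
    return text.strip()


def drop_percent_sign(text):
    return text.rstrip("%")


def to_number(text):
    return float(text)


def run_pipeline(steps, value):
    """Apply each tiny transformation in order, feeding output to input."""
    return reduce(lambda current, step: step(current), steps, value)


def choose_pipeline(raw):
    """Conditional logic lives here, at a level above the transformations."""
    if raw.strip().endswith("%"):
        return [strip_whitespace, drop_percent_sign, to_number]
    return [strip_whitespace, to_number]


raw = " 4.3% "
print(run_pipeline(choose_pipeline(raw), raw))
```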

In that way, the entire system could consist of millions and millions of transformations, some acting on general data types, others gradually getting more specific as the transformations require. Each one is well defined, and the path from context to context for each datum is also well-understood.

From a particular context, working backwards, it is an entirely straight-forward and deterministic pathway to get back to some known context starting point. That is, the computer can easily assemble the transformations required for a specific pipeline if the start and end contexts are known.
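
One possible way for the computer to do this assembly is a plain graph search over the known transformations. The sketch below searches forward from the start type with a breadth-first search, which is my choice for the example (searching backwards from the end type would work the same way); the type names and the `TRANSFORMS` table are invented for illustration.

```python
from collections import deque

# (source type, target type) -> transformation; every name is hypothetical.
TRANSFORMS = {
    ("text", "trimmed-text"): str.strip,
    ("trimmed-text", "number"): float,
    ("number", "percentage"): lambda x: x * 100.0,
    ("percentage", "display-text"): lambda x: f"{x:.1f}%",
}


def assemble(start, end):
    """Breadth-first search for the shortest chain of transformations."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        current, path = queue.popleft()
        if current == end:
            return path
        for (source, target), step in TRANSFORMS.items():
            if source == current and target not in visited:
                visited.add(target)
                queue.append((target, path + [step]))
    return None  # no known path between the two types


def run(path, value):
    """Push a value through an assembled pipeline."""
    for step in path:
        value = step(value)
    return value


pipeline = assemble("text", "display-text")
print(run(pipeline, " 0.043 "))
```

With millions of transformations the search would need indexing rather than a scan of the whole table, but the principle of deriving the pipeline from the start and end types stays the same.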

There is no limit to the number of contexts or the length of time they stay around. There could be a small number or, as we often cache a lot in modern systems, there could be a very large number of smaller contexts.

We can build massive computer systems from a massive number of these transformations that help the system to move data from one context to another. It would not take a huge amount of effort -- in comparison to normal programming efforts -- to break down all of the domain specific data into explicit data types and then map out a huge number of transformations between the different types. We do this work constantly anyway when building a big system; this just allows us the ultimate 'reuse' for our efforts.

Users of this type of system would create a context for themselves. They would fill it with all of the references to the various different bits of data they want to access, and then for each, map it back to a starting context. In a very real sense, the users can pick from a sea of data, and assemble their own screens as they see fit. A big browser, and some drag and drop capabilities would be more than enough to allow the users to create their own specific 'context' pages in the system.

We already see this type of interface with portal web applications like iGoogle, but instead of little gadgets, the users get to pick actual data from their accessible contexts. No doubt they would be able to apply a few presentation transformations to the data as well to change how it appears in their own context. Common contexts could be shared (or act as starting templates).

As an interface, it is simple and no more complicated than many of the web based apps. Other than the three core concepts, there are no unknown technologies, algorithms or other bits necessary to implement this.


RAMIFICATIONS

Software would no longer be a set of features in an application. Instead it would be millions and millions of simple transformations, which could be conveniently mixed and matched as needed.

Upgrading a computer would involve dumping in more transformations. The underlying software could be sophisticated enough to be able to performance test different pipeline variations, so you could get newer more optimized transformations over time. Bad pipeline combinations could be marked as unusable.
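
As a very rough sketch of that idea, the underlying software could time each candidate pipeline on a sample input, keep the fastest, and mark any that fail as unusable. The function `benchmark` and the candidate names below are hypothetical, and a real system would need far more careful measurement than this.

```python
import time


def benchmark(pipelines, sample, repeats=1000):
    """Time each candidate pipeline; mark any that raise as unusable."""
    timings = {}
    unusable = set()
    for name, steps in pipelines.items():
        try:
            started = time.perf_counter()
            for _ in range(repeats):
                value = sample
                for step in steps:
                    value = step(value)
            timings[name] = time.perf_counter() - started
        except Exception:
            unusable.add(name)
    best = min(timings, key=timings.get) if timings else None
    return best, timings, unusable


candidates = {
    "split-then-sum": [str.split, lambda parts: sum(map(float, parts))],
    "split-convert-sum": [str.split, lambda parts: [float(p) for p in parts], sum],
    "broken": [str.split, float],  # float() of a list raises TypeError
}
best, timings, unusable = benchmark(candidates, "1.5 2.5 3.0")
print(best, sorted(unusable))
```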

Big releases from different vendors or even different domains could be mixed and matched as needed. One could easily write a simple series of patching transformations to map different sets of obscure data onto each other. All of our modern vertical silo problems would go away.

Distributed programming or parallel programming are also easy in this model, since it becomes a smaller question of how the individual pipelines are synchronized. Once reasonable algorithms get developed -- since they don't change -- the overall quality will be extremely high and very dependable.

In fact the system will stabilize quickly as more and more transformations get added, quantified and set into place. Unlike modern systems the changes will get less and less significant in time, meaning the quality will intrinsically get better and better. Something we definitely don't have now.

Of course the transformations themselves are still programming. But the scope of the programming has gone from having to create hugely complex massive programs to a massive number of hugely simple small ones. The assembly is left to the computer (and indirectly to the user to pick the data).

Eventually, though, the need for new transformations would slow down, as all the major data types for all of the various different domains would get added. Gradually, creating new transformations would be rarer and rarer, although there would always be some need to create a few.

Just quickly skipping back to typesetting, it should be noted that Graphic Designers still occasionally manually tweak kerning or leading to make some graphic layouts have a high quality appearance. The job disappeared, but some vestiges of it still remain.

Of course, we will still need data analysis and operations people to handle setting up and running big systems in production, but the role of the programmer agonizing over line after line of code is not necessary in this model. The computer just assembles the code dynamically as needed.


SUMMATION

I presented these ideas to show that there is at least one simple model that could eliminate programming as we know it. Although these ideas are fairly simple, building such a system involves a great deal more complexity than I addressed.

It is entirely possible, but even if these ideas are picked up right away, don't expect to see anything commonly in production for a long time. Complex ideas generally need about twenty years -- a generation -- to find acceptance, and some ideas need to sit on the bench a lot longer before people are willing to accept them.

Even if we built the best distributed transformation pipeline systems with near perfect quality, it would still take decades for the last of the old software to die out. People become rather attached to their old ways, even if they are shown to not work very well. Technology rusts quickly, but fades slowly, it seems.

Programming, while doomed, will be around for a while yet.

Saturday, April 4, 2009

My PC Crashed (Again)

An ode to my PC:

I hate my PC. I've had many computers over the years, and often they have found a soft place in my heart. My Apple ][+, bought used, served admirably for years. My XT lasted longer than I could have ever imagined. The super well-run BSD Unix boxes at the University of Waterloo were always a delight to use. Even my VMS workstation which could be a bit cranky, survived for five years without ever being turned off. Most of my Unix workstations went six months or more without issues or reboots. Yes, I've been lucky to be able to work with computers that actually worked for me.

I hate my PC because it's undependable. It's the hardware: cheap crappy stuff that keeps failing. I'd buy better, but I can't tell anymore what is good and what is crap. Possibly because it is all crap now. It's the software: millions upon millions of lines of slipshod, hacked junk so full of potholes that it would take several million lifetimes to patch them. The Microsoft stuff is bad, but the overall industry stuff is worse. It's an endless sea of spaghetti barfed up late in the night. It's the support: anything goes wrong, well too bad, we told you in the disclaimer that we weren't responsible. We'd like to help, we really would, but who really understands these things anymore? It's the product as a whole: the PC and all of those things that go along with it. The hardware, the software, the market, the add-ons, the environment, the culture and all of the services. The whole kit and caboodle. PCs started life as the hacker machine, allowing the hacker culture to build up high and mighty around them. You can do anything with these machines, except make them work consistently.

I hate my PC because they over-charged me. Research is expensive, but despite that, computers have spawned a huge number of fortunes. Fabulously rich people. All those millionaires and billionaires strutting around, raving about their successes, writing books and giving advice. That would be OK if my machine actually worked. But given that it doesn't, a fair price would have been a fraction of what I paid, or what they tricked me out of. It isn't for love or for knowledge that they work at building these machines, but for the rights to a big mansion, a fancy car and a huge boat. And to throw it all back in my face, many of them now spend their days giving away all of the 'extra' dough they collected to charities and needy causes, instead of actually making the machines work properly. We didn't get what we paid for.

I hate my PC because it is the gateway to the Internet. That once hallowed sea of massive information is now nothing more than a giant propaganda machine. A medium for cheap hustlers to make a buck. Proof that too much of humanity is irredeemable. Gone are the days of information and knowledge, replaced by cheap tabloids, marketing and gossip. Just another crowd of people hoping to cash in. It has become another form of TV, bent on keeping the masses mollified. A medium to be mastered, it's simply a question of which groups are winning in that endless race to waste your time and take your money.

I hate my PC because it is plagued with infestations. As if bugs weren't bad enough, it now has viruses, trojans and worms galore. Written by misguided kids to generate profits for organized crime or organized business, both eerily similar. One could understand the original phone phreaks and their inherent curiosity to explore technology, but it was bounded by a strong no destruction ethic. These days in the free-for-all online world, the motivations have changed. It is just mean and ugly now; for glory or for profit, it doesn't matter who gets hurt. Whatever good there might have been vanished long ago.

I hate my PC because I fear that some big corporate stooge is going to install one to maintain my bank account. The big, super-expensive, slow, honking great mainframe computers that we've relied on for decades are damn near impossible to change, yet that's probably the reason why I don't have to go rushing into my bank branch each month complaining about missing money from my account. With constant moves to take the retrograde PC operating system -- cobbled together from a long, nasty history of flat foods, micromanagement and insane deadlines -- and jam it down the collective corporate throats, there is an increasing chance that we'll become more dependent, not just on those old run-of-the-mill crappy mainframe computers, but on these newer bottom-of-the-barrel super-crappy PC ones. A bad day, waiting to get worse, for sure. I don't want this crap on my desk, so I certainly don't want it in my bank's fancy air-conditioned machine room, nor anywhere near anything that is even remotely vital to our lives. The dump is my preferred location.

I hate my PC because it is a metaphor for what has happened to our society. We have become overly complex; well over the top. And yet underneath, we are increasingly dysfunctional. We're on a steady downward tumble, selling our souls for cheap disposable baubles. Filling our basements with dusty junk. Foolish victims of a society run amok, no longer grounded in the things that matter. We create stupid rules, and then pile them up on top of each other, so high that they collapse of their own weight. What decades ago started as a movement to fix the world, fell down to simply changing it, gradually for the worse. Now everybody wants to leave their mark, even if it is just graffiti. And we have no way of unfixing it after they're done. The status quo might not have been great, but our present technologically sophisticated, flashy, but all around fragile existence, like our cheap crappy bloated PCs, is constantly just one wetware virus away from us having to hit the reboot button and lose all of our data. Again.

I hate my PC because of what it is not. It is not the tool to save the world, nor is it the answer to mankind's problems. It doesn't automate stuff or make our lives better. Instead it disconnects us from what really matters and lures us with a shiny yet false promise. Over the years, each machine I used has become more sophisticated and faster following Moore's law, yet the later ones, one after another, have followed an anti-law as well. Each one gets crappier than the last; each one gets more undependable. Each one is more vulnerable to spammers, virus writers and marketers. Each one descends a little farther down that slope, giving me only the barest sense of stupid improvement by flashing some more pretty spinning 3D graphics on my screen. Each one strengthens my disappointment. Each machine now slowly eats away at our collective sanity.

Most of all, I hate my PC because it displaces what should have been there on my desk. A machine that works, one that I trust and one that improves my world. Now, instead of making my life easier, I find some new crisis there every three to six months, be it another dead piece of hardware, a bad software bug or a full-blown virus/trojan attack. And when it's not a major crisis, it is still continually whining about some useless upgrade, or that the net is unavailable again, or it is just being dog slow. I cannot trust this thing; it simply disappoints me whenever it gets the chance. Instead of helping me make sense of the world around me, I find a sea of messy disconnected data that is hopelessly inconsistent and incoherent. Showing me that while there might actually be answers, today is not my day. Instead of giving me more luxury time to explore the world around me, it chains me to my chair and forces me to endlessly install, restall, destall or just stall my life trying to find some temporary combination of crappy software, bad utilities and irritating sites that will momentarily make some minor life improvement, before once again sending me back into the breach to fix yet another stupid but entirely avoidable and moronic issue with these cursed machines. All I want is a real computer, one that works. One that I can trust.

I hate my PC.