Friday, February 27, 2009

97 Things Every Software Architect Should Know

It's been released! Richard Monson-Haefel’s project 97 Things Every Software Architect Should Know was a community effort put together by a large collection of software aficionados. It started as a web site, and has now been published in printed form. I think it's a great bit of work, but I'm slightly biased as I managed to get a couple of my own contributions into the effort. Beyond its great advice, the book is important for several other reasons.

The first is that it managed to reach out to the whole community for support. Other books have done similar things, for example Beautiful Architecture and Real World Haskell, but this one opened itself up to allow anybody to submit complete axioms at its web site. This sets it apart because it's not just about the people, it's about the quality of the advice itself. The axioms were picked because they worked, not because the authors were famous or connected.

Another important reason is that in such a young discipline, much of what is needed to push software development into the next generation of technologies will come from the front lines, not from the hallowed halls of academia or from the back alleys of big corporations. It's the people out there in the trenches who know which parts of our technologies work, and which parts are dismal failures. Software development has often been driven from afar by theoreticians and ivory towers, so it's nice to see us getting real feedback from the ground floor. If we're going to find better, more reliable ways of building complex systems, the answers are going to come from really getting to understand what we know, not just what we think we know.

It's available on-line, but hopefully people will support the effort by getting out there and buying printed copies.

The Amazon book link is located at:

http://www.amazon.com/Things-Every-Software-Architect-Should/dp/059652269X/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1233780733&sr=8-1

The original Web Site is at:

http://97-things.near-time.net/wiki

Positive comments, feedback and reviews would be appreciated.

Saturday, February 21, 2009

Building in Quality

Today I'll try to be terse. Why? I don't really know.

Computers have only two things: data and code. Whatever is one, is not the other.

Code is more than just the primary programming language. It is any type of instruction, explicit or otherwise, that tells the computer to do something. Sometimes it's not obvious. Configuration files, for example, are an implicit programming language for loading bits of embedded data into a running program. The syntax wraps the data and tells the program how to load it.
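As a rough sketch of that idea (the file name and keys here are hypothetical, and Java is used only for illustration), a plain properties file is really a tiny implicit language: each line is an instruction telling the program which piece of data to load and where to put it.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    // A minimal sketch: the configuration file is an implicit language.
    // Each "key=value" line instructs the program to load one piece of
    // embedded data; the syntax wraps the data and says how to load it.
    public class ConfigLoader {
        public static void main(String[] args) throws IOException {
            Properties config = new Properties();
            // "server.properties" is a hypothetical file containing, say:
            //   port=8080
            //   logLevel=debug
            try (FileInputStream in = new FileInputStream("server.properties")) {
                config.load(in);
            }
            int port = Integer.parseInt(config.getProperty("port", "8080"));
            String logLevel = config.getProperty("logLevel", "info");
            System.out.println("Listening on " + port + " at level " + logLevel);
        }
    }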

Shell scripts, document layouts and micro-languages are more examples. Counting design, development, testing, packaging and deployment, there is a huge variety of different coding types involved in even a trivial system.

Quality is misunderstood. Perfect is the highest quality; from there it goes down, past bad. Quality comes from materials and manufacturing. Quality comes from syntax and semantics. Bugs lower quality, but so do bad graphic design and poor usability.

Increasing effort increases quality. Always. With humans, it is more obvious. If you take your time and do it well, most often it will be better. If you go back over it a bunch of times, like 'editing', that will help too. With robots it's subtler. The work is done once, and it is either right or a defect. Since nothing in the world is consistently perfect, the robotic work varies somewhat. Quality control weeds out the imperfect items. With better QA, more items are dumped, quality goes up, but so does the cost. Increasing effort (weeding out more) increases quality.

Bigger pieces have smaller errors in between. If you build a watch with 600 pieces, there will be more errors than if you build one with 3. Overall, the quality of the three-piece watches will be better. Assembler programs -- built from smaller abstractions -- have larger bugs than Java programs. Corrupt stacks, bad pointers and memory-management leaks are common. As the abstraction gets larger, the size of the bug gets intrinsically smaller.

Abstract programming is trying to build bigger pieces out of the original pieces provided by the language. The "abstraction" is a bigger, self-contained piece. A bunch of them will increase the quality of the system.
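To make that concrete (the Money class below is a hypothetical example of mine, not a prescription), the difference is between scattering the language's raw pieces through the domain code, and wrapping them once into a bigger, self-contained piece that everything else leans on:

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    // A hypothetical "bigger piece": the fiddly rounding rule lives here once,
    // instead of being repeated (and occasionally botched) across the domain code.
    public final class Money {
        private final BigDecimal amount;

        public Money(String amount) {
            this.amount = new BigDecimal(amount).setScale(2, RoundingMode.HALF_UP);
        }

        public Money add(Money other) {
            return new Money(amount.add(other.amount).toPlainString());
        }

        @Override
        public String toString() {
            return amount.toPlainString();
        }
    }

    // Brute force: every caller writes new BigDecimal(...).setScale(2, ...) inline,
    // and each repetition is another chance to get it wrong. Abstracted: callers
    // write new Money("19.99").add(new Money("0.05")) and the rule can only be
    // wrong -- or fixed -- in one place.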

Brute force programming -- pounding out all of the domain-level code as non-abstractly as possible -- will be tied to the intrinsic level of quality associated with the language. With way more code, it will also have way more bugs, and be way more inconsistent. More code in smaller pieces is always bad.

The more you use something, the more its flaws become obvious. If you write some code that is used only once in the system, it will be harder to find the bugs. If it is used hundreds of times, it is easier.

If code is leveraged, it is implicitly tested. We know strcat works correctly in C, because it's used in millions (billions?) of lines of code.

Twice as much code is more than twice as much work. It doesn't matter what type of code it is; it has inter-dependencies that make the work increase in a non-linear way. Duplicated code, even if the copies are different types of code, is still duplicated. In fact, duplication across different types of code is worse.

The significant bugs in a system come from the gaps between pieces, not from the abstract pieces themselves. If you've decomposed your system into fifty consistent 'things' that need to be written, the things themselves are well-defined (and easily provable). It's the space between them that has most of the bugs (and all of the really bad bugs). Techniques that heavily focus on inner-piece quality (for example, unit-testing) focus on the easier of the two problems. The significant errors come from what is in between the units, and only show up during integration. Testing at higher levels is more effective.
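A tiny, contrived sketch of the kind of gap I mean (both classes are invented, and each would pass its own unit tests without complaint):

    // Each piece is trivially correct on its own; the bug lives in the gap between them.

    // Unit A: reports elapsed time in milliseconds. Its unit tests pass.
    class Timer {
        long elapsedMillis() {
            return 1500L;
        }
    }

    // Unit B: bills by the second. Its unit tests also pass.
    class BillingService {
        double charge(long elapsedSeconds, double ratePerSecond) {
            return elapsedSeconds * ratePerSecond;
        }
    }

    public class Integration {
        public static void main(String[] args) {
            // The integration bug: milliseconds handed to a method expecting seconds.
            // No unit test on either class will ever see it.
            double cost = new BillingService().charge(new Timer().elapsedMillis(), 0.10);
            System.out.println("Charged: " + cost); // 150.0 instead of 0.15
        }
    }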

A QA department that always receives good quality code will never get tested itself. It will eventually let a bad bug through. It will be a waste of money. Testing, after all, is a less rigorous form of code. Manual tests are still programming, even if the 'machine' isn't, and the execution is more hopeful than deterministic. You can't tell whether the testing is working if it produces no results. Testing that produced no bugs is a waste (or defective).

All programs have bugs and interface problems. All programs will always have bugs and interface problems. Short of mathematics, nothing in this world can be perfect (by definition). Plan on it. Increased quality is good, but better support is way better.

Increasing the size of the pieces is the only reliable way to decrease the number and size of the bugs (increase the quality). Programs are usually too big to benefit from 'editing' (code reviews, or recoding twice with test files).

Some of the pieces should grow, and some should not. If you've expressed the domain problems in terms of underlying pieces, some of them form a consistent level of abstraction, and thus are fixed. Some of them form knowledge chunks, and as they grow larger, the upper layer can be simplified (and thus indirectly made higher quality).

The cost of programming a line of code in a system is similar to pricing a financial bond. It's the initial time it took to create, plus all of the times that the line needs to be revisited over the years. If the line has other dependencies, they must be revisited too. Thus the actual cost can only be estimated as a series of code-visits throughout some extended period of time. The series ends when the line is finally deleted. You do not know the full cost (accurately) until the code is no longer in service, although you can estimate the flows over a period of time.
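As a back-of-the-envelope sketch of that cash-flow view (all of the numbers below are invented, purely to show the shape of the estimate):

    // A rough cost model in the spirit of pricing a bond: the initial effort plus a
    // stream of future "visits" (fixes, reviews, rework) until the code is deleted.
    public class CodeCost {
        public static void main(String[] args) {
            double initialHours = 2.0;     // writing the code the first time
            double hoursPerVisit = 1.5;    // average cost of each later revisit
            double visitsPerYear = 3.0;    // how often it gets touched
            int yearsInService = 5;        // until the code is finally deleted

            double lifetimeHours = initialHours
                    + hoursPerVisit * visitsPerYear * yearsInService;

            // 2.0 + 1.5 * 3 * 5 = 24.5 hours: most of the cost comes after it "works".
            System.out.println("Estimated lifetime cost: " + lifetimeHours + " hours");
        }
    }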

The more you visit the code, the more expensive it is. It's cheapest if you can just create it, then get it right into production. No fuss, no muss.

Fixing the quality of the code doesn't fix the quality of the design. Just because there are fewer bugs doesn't mean the program sucks any less. A good program with a couple of bugs is better than a badly designed one.

Users know what their job is, not what they want from a computer. If they had the answers, they'd be software developers. Software developers are the experts at turning vague notions into data and explicit functionality. The users should lay out a rough direction, but it's up to the developers to make it consistent and usable.

Listening -- too much -- to the "stakeholders" is the same as "design by committee", and will guarantee that the system is poor. An expert software developer is someone who can take (with a grain of salt) all sides, massage them, and produce a consistent design that really solves the problems (or at least increasingly automates them).

Pushing all of the choices back to the users is only a way of mitigating risk. If they told you to make it red, then you can't be blamed for making it red, can you? If you stop caring about blame, then you might endeavor to find out 'why' they want it to be red.

Shorter development iterations increase cost but mitigate risk. If you develop the pieces a bit at a time, constantly checking that you've not gone off the rails, then when you do there is less work to unroll. Less time is lost, but at way more cost. Short iterations are a form of insurance (and depending on your ability to absorb risk, they are more or less vital).

The system is 'rapidly changing' only from the developer's perspective. Most users have been doing their job, mostly the same way, for years. More or less, they don't change that much, that often. Most of the changes experienced by software developers, and there are always a lot, come from not really understanding the domain to begin with. Developers rush to judgment, and often make huge mistakes, compounded by a stubborn unwillingness to admit it.

The ability to implement rapid changes inspires frequent changes. It's not good. If you make it easy to change things, people will make more random choices. If they don't need to think hard about it, they won't. People will always follow the easiest path. Mainframes are the most successful software platform, because they are the slowest and hardest to change.

There is more artificial complexity in most computer systems than there is real complexity (domain or technical). Mostly, in the current state of the industry, we've been responsible for shooting ourselves in the foot an incredibly huge number of times. One could easily guess that for most existing systems there is some equivalent system no more than one tenth of the size. That is, 90% of most systems comes from the ever-increasing mountains of artificial complexity that we keep adding to our implementations. Mostly it's unnecessary. And its origin may be so deep in the technical foundation that it is impossible to remove or fix (but it's still artificial).

Solving technical problems is easier than solving domain ones. That's why areas like OpenSource focus their efforts on technical problems (it's also more fun).

New technical solutions require prototyping. New domain ones do not, but they may be a good reason to increase the size of some of the underlying pieces. Code is just code, unless you've never tried that technical approach before; if you have, it's easy (and estimable).

And finally. Most of the quality issues that a regular programmer can solve come from consistency, effort and self-discipline. Increasing quality for most code comes directly from just "editing" one's own work. The more you edit, the nicer it will be. Once you've gotten past that, abstraction is the next big leap into better quality. Simplified, six-normal-form designs have intrinsically better quality built right into their construction. They are harder to build, and slower, but they are cheaper overall. Quality at an industry level can be improved, but it will take a new generation of higher-level abstractions to get there. Languages and tools that hide "things" like threads and data types, for example. Programming paradigms that don't depend on non-isomorphic mappings to reality, but offer syntax closer to the real domain-specific problem descriptions. A new way of thinking about the problems, one that reflects our understanding of the sheer size and scale of coding.

OK, I do know why. But it would take too much space to explain it, and way too long to edit it into something with enough quality to be readable :-)

Thursday, February 12, 2009

Maneuverability and Sales

Two quotes to start:
"There's a sucker born every minute"
P.T. Barnum (1810 - 1891)

And more importantly:

"Up until maybe a year ago, I had a pretty one-dimensional view of so-called "Agile" programming, namely that it's an idiotic fad-diet of a marketing scam making the rounds as yet another technological virus implanting itself in naive programmers who've never read "No Silver Bullet", the kinds of programmers who buy extended warranties and self-help books and believe their bosses genuinely care about them as people, the kinds of programmers who attend conferences to make friends and who don't know how to avoid eye contact with leaflet-waving fanatics in airports and who believe writing shit on index cards will suddenly make software development easier."
Steve Yegge, Good Agile, Bad Agile (2006)

Sure Steve was a little harsh in that second quote, but there are three huge facts that people keep blindly skipping over lately:

- Programmers are a sizable market (with lots of money).
- There are lots of people trying to make their living exclusively from selling to programmers.
- Most highly visible people are visible because they are selling something.

Ultimately, this means that for many of the people who are outspoken in the programming industry, some of what they are saying is good. Some of it, however, is just filler that they made up one morning in order to ensure that their cash flow doesn't dry up.

You, the programmer, are just another client, and they -- the writer, author, consultant, adviser, or whatever other title they have -- have identified you as a potential customer. This means that they will tell you what you want to hear. If it's right, that is nice, but it's not obligatory.

For all those people out there saying with such confidence that person X said "blaa, blaa ...", you can't necessarily take what they said at face value. It might be right, but then again it might be wrong. And it doesn't matter if they've had an unbroken record of being perfect for twenty years, this latest direction could be entirely full of it. Since most of what they said in the past was subjective, that too could equally be full of it.

But a simple analogy might be more helpful. If you go down to your local mega-drug store, you'll find a huge number of products. Each and every one will have a label espousing its virtues, and many will make huge claims. Some of these products will, of course, work as directed, and the effects will be as described. But many of these products won't be nearly as effective as their advertising. And some of these products will be downright misleading or dangerous.

This happens, of course, because the really big drug store doesn't actually care about its customers. Sure, they have a marketing campaign telling you they do, and if something goes wrong, it will ultimately hurt their bottom line. But they only really care about making money, not about helping people. Hurting people is bad if and only if it makes less money. It's not a terribly pretty ethic, but it is what it is.

For someone, anyone really, who makes their living through the promotion of ideas with regard to a discipline, the same is absolutely true. They are interested in good ideas only so long as they make money and they can sell them. Good ideas that are unsellable, because they are "owned" by someone else, or because they're too obvious, just don't have value.

Once a market gets established, the players all have vested stakes in making sure it continues. We certainly see this happening in programming, with a new(ish) wave of lighter processes selling massive numbers of books, conferences, training, consulting and a whole host of other profitable spin-offs. Once it gets going, it doesn't matter whether it makes sense anymore; that's no longer the point.

It's easier in an immature discipline like software development, since most of what passes for advice is subjective rules of thumb. People don't know, so you can state anything that sounds plausible and make a reasonable argument. Of course, being as subjective as it is, it certainly, by definition, isn't going to make a huge difference. If it did, then it would be objective, irrefutable and quite possibly unsellable. That's just not profitable. If it worked as advertised and solved all of the problems, it would dry up the market, wouldn't it?

Of course, most people selling aren't that petty. But even if they aren't particularly out to get you, they're not necessarily out to help you either. They're out to make a living; that's what is important to them. And those motivations make it really easy to step on their toes, often with negative consequences.

You know you are really dealing with something questionable when its practitioners defend their subjective claims mostly by means of personal attacks. That's the ultimate low point. A common defense is to state that if you haven't "done X", then you can't possibly have a valid counter-argument. Nonsense, of course, as we humans have great imaginations and understanding, which often allow us to truly understand things even if we personally have not experienced them. A claim of any type should be able to stand up to a few thought experiments, if nothing else. Good in theory does not mean good in practice, but the converse isn't necessarily false. If it works, you can talk about it and understand it, even if you've never experienced it yourself (the only real case where you can't is emotions). You can always generalize from the real world, you just might not be able to compact it enough.

Things that are good are intrinsically good. They will stand up to argument, and they can be discussed even if it's hard to entirely rationalize their substance. Sometimes things are good in a specific context, but don't generalize well. A time or place may have contained other ingredients that aren't properly being accounted for. Still, it's worth investigating to really see why the reality differed from expectation; usually that is a sign of something else lurking about.

Sometimes the right answer is not the immediately obvious one. Sometimes an answer is marginally better, but still not good. Sometimes it takes a long long time to work through all of the bad answers first.

The road ahead in software development is filled with an untold number of bad theories and false starts. That's the inevitable truth in any maturing discipline, yet it's one that many software practitioners seem unaware of. Just because something is appealing or sounds good doesn't mean it is good for you. That's obvious in food -- chocolate cake for breakfast every morning will eventually kill you -- but it's also true in programming practice. The best one can expect is that we take some time to think about each new approach, and if necessary try it out somewhat. Buying into everything at the 100% level is a recipe for disaster. The goods being sold are just not that reliable.

So what about me? The truth of the web is that if you take a hard stance on an issue, you'd better expect to get some negative feedback, and in our industry they'll go for the throat. I'm not out to interfere with anyone's right to make a living; I'd do so myself if I thought I could write and lecture well enough. But I don't, and I don't think people should take what I say at face value either. I just put it out there to get the discussions started. It's important, I think, that we take a hard look at what we are doing, and really, genuinely explore what we know and the alternatives. The only thing I am really sure of is that we haven't found it yet, whatever "it" is.

I really do think that many of the programmers expressing their popular opinions in blog comments have forgotten, or never knew, "Caveat Emptor" -- Latin for "Let the Buyer Beware". No one in their right mind would advocate Coke as a medicine because of the slogan "Coke adds Life". It's a soda pop; we know to take the marketing claims lightly. The truth in the software industry is that much of what passes for industry best practice, old and new, isn't much more trustworthy than a Coke commercial. It sounds nice, and it helps sell. So it's very disconcerting when people quote it like gospel. Getting back to my earlier example, just because a big drug store sells it doesn't make it work. Most people know by now to be skeptical in a drug store.

When the pundits are pounding each other back and forth over issues, never forget that you are the buyer -- this show is for you -- and that some of what they are saying isn't worth paying for. There is good advice buried there, but you can't utilize it effectively if you're unwilling to admit that some of it is crap as well. Don't take the marketing on the package literally.

Tuesday, February 3, 2009

In Expression

The other day I was reading a recent issue of National Geographic. It was a story on Charles Darwin, and the author, David Quammen, was speculating about when and where Darwin finally came upon his famous theories. I found it interesting, since I could easily imagine that just prior to Darwin's 'aha' moment most of what was circulating around his head were vague notions of some hazy concept. Pieces, sure, but not the whole thing, and certainly not a refined version. Ideas don't just pop into heads that way, as complete pieces. He, in a sense, was beginning to formulate the knowledge, but he had no way of expressing it.

The big difference between a partial understanding and really 'getting' something is being able to express it in a simple, understandable manner. You might have a reasonable amount of knowledge around a specific topic, but when you sit down and try to explain it, the gaps suddenly become extremely noticeable. You know something when you can express it. And you know it really well when you can express it in multiple ways.

Those ideas interest me because we can apply them to Computer Science too: you cannot write what you don't know. A programmer flailing away with only a vague notion in mind will not be successful, by definition. If they don't know what they are writing, it will not work; they can't express it.

Even more interesting is that, in some sense, certain languages are going to help in assembling ideas more quickly. People often talk about how the Inuit and other far-northern cultures have a huge vocabulary for snow. I'm not sure if that's really true, but certainly many natural languages have been influenced directly by their environment. In that sense, a speaker of a language with more appropriate words is more likely to be able to cross that knowledge gulf to the other side and express their ideas more clearly.

The elements of spoken knowledge -- our vocabulary -- assist us in understanding the ideas themselves.

But even with a limited natural language, if your vocabulary is wide and domain-specific it certainly makes it far easier to extend your underlying knowledge to new points, particularly if they are not too far from the initial ones. If you know how to express several big ideas, you can build on them to create even bigger ones.

Expression, then, is more than just formulating correct syntax. It's finding a suitable arrangement to communicate something complex, whether to another human, or to a machine for execution. In a human sense, it's about taking those vague threads in one's understanding and actualizing them into a coherent stream of information.

In a computer sense, it's about taking those vague notions of possible functionality, data and user needs, and actualizing them into a complex structural form in multiple computer languages as a system.

How we ourselves assemble the bits to create knowledge is similar in many ways to how we as system analysts assemble the parts to create specifications. Both bring order to the chaos. Both actualize vague notions.


EXPRESSIVENESS

If you've set out to create a theory of evolution, even if you haven't named it that, the first thing you do is to collect as much underlying related information as possible. A trip around the world would do, for example. It's on those base facts that you'll build your ideas.

In creating a big software system, the designers and analysts set out with the same goal. On deciding which problems to solve, they collect a huge amount of base information in a very domain-specific format. If you talk to enough people, preferably experts in the domain, from all of the different perspectives, you can assemble a deep and complex picture on which you can build.

Given that effort, it makes the most intuitive sense to want to do the absolute least amount of translation necessary in order to express that domain-specific understanding in a form directly usable by a computer. Translation is inherently dangerous.

We've collected the data in a domain-specific format; shouldn't we try to utilize it in that form as much as possible?

That translation work, often ignored by language developers, underscores the success that languages like COBOL and APL have had over the decades in forming the basis of many systems.

COBOL is a particularly verbose and clunky language, but for your standard business application it fits well with the domain, minimizing translations. COBOL was certainly one of software's most popular languages, and it's highly likely -- given the failure of most modern technologies in displacing the older, cruder, yet way more stable mainframes -- that it still accounts for most of the data, and certainly most of the world's mission-critical data (your bank accounts, for instance, are likely held by a mainframe running COBOL; if they're not, you may want to consider changing banks).

Similarly, APL, a matrix-oriented language, was hugely popular with the disciplines solving mostly matrix problems, like actuarial science, better known as insurance. Translating from the natural mathematical domain of the problems into some procedural or object-oriented paradigm opens up considerable dangers. Translating to something closer to the domain is considerably safer. Bugs come from misunderstandings, but way more bugs come from mistranslations.

It seems rather obvious that we'd like to avoid translating our domain problems into other, more complex formats, but we keep pursuing technologies that do this very poorly.

Software development is about solving technical problems as well as domain ones, and so much of the industry prefers to tackle the technical problems. They are simpler, more straightforward, and generally black and white. Domain problems are bigger, uglier and often very grey. With that difference, it's no wonder the technical problems hold more of a fascination for programmers.

Unfortunately, a purely technical solution solves no real-world issues directly on its own; it needs to be embedded into a domain-specific solution to find its real value in this world. The trouble comes not from a technology like a database, but from how we use it in our customer relationship system.

Not surprisingly, most of the modern popular languages focus heavily on solving technical problems, while absolutely ignoring anything in the domain spectrum. The best languages, however, make up for this a bit by allowing themselves to be extended. The programmers, if they put in the effort, can then tailor the language to become more domain-specific, hopefully encapsulating the technical and lower-level domain details deep in the foundations, away from most of the other coders.


QUALITY

If you were going to write an academic paper, you'd be very careful in choosing your language. Most disciplines have evolved over time, so there are well-known concepts that everyone uses in order to work through the mechanics of their problems. The denotations and connotations of the underlying terminology grow ever larger as each new paper builds on a continuing theme. In that way, the pieces get bigger and bigger.

But, assuming that the underlying papers survived time and peer review, the quality of the upper levels of work also gets intrinsically better. How? As the terms grow, and become larger generalizations, the pieces become far more static. That is, they are harder to put together in an incorrect manner. If you're using the right terms, in the right way, as previously defined, their depth means that the allowable permutations for interconnecting them are reduced; otherwise you'd be violating the definitions. Your peers, presumably, would notice this right away.

That also applies to computers, although for programming languages it is a lot more obvious. In assembler, for example, a programmer might easily forget to push or pop something on the stack, causing a bug. Skip up to the higher abstraction of C, and the compiler itself does all of the pushing and popping, eliminating most, if not all, stack problems. But at that particular abstraction level, pointers can be easily manipulated. Thus, C programs suffer horribly from a lot of loose pointer errors. Memory management is also up to the programmer, causing another common set of bugs.

As we work ever higher, the lower-level problems disappear, but new ones surface, although smaller and less debilitating. Java, for instance, cannot have a loose pointer, and although possible, it is far harder to leak memory. On the other hand, threading problems are rampant, and the systems are so big, bloated and bulky that they've exceeded the growth of the hardware. The problems are still there at the higher level, but they've become way smaller.

It's far more likely that a group of programmers will get a reasonable system done in Java than it is that they will get the same one done in assembler. While it's possible, a system in assembler even half the size of a modest Java one would be an uncontrollable bear to keep running. Way, way too much work.

What does this have to do with quality? Well, those problems, seen as indirect inputs into the process of building a system, are a natural byproduct of the constructive process. Bugs, I am saying, are impossible to avoid. Bigger bugs are presumably harder to find, and more work to fix.

If an underlying step up in abstraction, almost by definition, causes smaller problems, then it is also indirectly taking the system closer to a higher quality. Although not entirely linear, 4 huge pointer bugs are a far harder and more time-consuming problem than 30 little typos. If your language doesn't allow pointer bugs, and hasn't nicely replaced them with some other equivalent bug like threading problems, then that step upwards comes with a noticeable step up in quality.


ABSTRACTIONS AND THINGS

Although it's obvious that a higher-level abstraction will increase the quality, it is not always obvious that a 'different' abstraction is a higher one. Java programs and C programs share an instability, although their underlying causes are very different.

But the paradigm itself, as an aspect of the programming language, may also have a big effect.

We deal with most things in our real world in a four-dimensional sense, and as I've often said, it affects the way we structure things and our language itself. The object-oriented paradigm is a way to model the world around us as a series of atomic 'objects', each made up of some code and some data. This particular model mixes the limited expressibility of static data with the broader expressibility of running code in order to provide an atomic element that is flexible enough to span our common four-dimensional functional space.

In a sense, it breaks down every element in our world into a model of a 'thing' (data) in 'time' (code).

That model, it turns out, can be applied to all things in our world, but most people applying it assume the code side of every object should be fully utilized. That's nice, but much of what we seek to represent in a computer is happily static data, of the very boring 3D kind.

An inventory system for a restaurant, for example, need only keep track of what's in and what's out for the current time period; the 4th dimension is not used or particularly needed for the system to run. Modeling an inventory system with no time constraints in an object-oriented framework forces the designers to translate pretty simple data into dangerous, state-driven objects -- a complex translation that we know how to do, yet one that is often done incorrectly.
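A small, hypothetical illustration of the two shapes (the restaurant-inventory names are mine): the first treats an inventory entry as plain, boring data; the second wraps the same information in mutable, state-driven behaviour that now has to be kept consistent.

    // Plain data: an inventory entry is just a value. Nothing to get out of sync.
    final class StockRecord {
        final String item;
        final int quantityOnHand;

        StockRecord(String item, int quantityOnHand) {
            this.item = item;
            this.quantityOnHand = quantityOnHand;
        }
    }

    // The same information forced into a state-driven object: now there are
    // transitions, orderings and invalid states to guard against.
    class StockItem {
        private final String item;
        private int quantityOnHand;
        private boolean reorderPending;   // extra state the data never needed

        StockItem(String item, int quantityOnHand) {
            this.item = item;
            this.quantityOnHand = quantityOnHand;
        }

        void receive(int amount) { quantityOnHand += amount; }

        void consume(int amount) {
            quantityOnHand -= amount;
            if (quantityOnHand < 10 && !reorderPending) {
                reorderPending = true;    // behaviour entangled with the data
            }
        }
    }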

The generality of the object-oriented model imposes an order of complexity on any problem where that specific model is not necessary. That, it turns out, is a lot more places than most people realize. History -- time in particular -- is not often applied to modern computer systems, and is rarely applied well or consistently. Which means that for the bread and butter of modern systems, we are going to a huge amount of extra work trying to force our real-world view into a paradigm that makes it way harder to express.

It fails often, as one would expect it to.


LANGUAGES

Ultimately, we'd like a collection of technologies that makes it easy to express the various different problems we encounter frequently in software development. We want the underlying solutions to be generalized, but we want what sits on top to be tailored specifically, and very domain-intensive.

Many large organizations find that their systems provide a competitive edge, so there is always value for any corporation in distinguishing itself through process. Since computer systems are coming more and more often to define the process, they still represent important competitive areas.

The closer we are to expressing things in their domain-specific formats, the closer we are to massively reducing the presence of bugs. Certainly, it takes some of the fun out of programming -- people love to struggle with overly complex logic and fragile systems -- but as an industry we have to move past having our low success rate rooted in our passion for hacking. Sooner or later, someone is going to figure out how to make it more reliable, so sooner, hopefully, the techniques for building things are going to change.

Going back to languages, C programmers used their own code or libraries to manipulate complex data structures like hash tables. Access to a hash table was expressed through pointers, so even if the underlying library was solid, there were many problems with getting hash tables to work in a typical C program.

Perl, on the other hand, has hash tables, also known as associative arrays, built right into the language. This higher-level paradigm means that the Perl interpreter can perform semantic as well as syntactic checks on the code, allowing the language to prevent the user from utilizing it incorrectly. C, by contrast, knows nothing of its programmer's intent; the hash table code is just like any other code in the system. The syntax can be checked, but only in the most minimal sense.

Does that matter? Associative arrays as an expression paradigm provide a large set of ways of handling complex problems that are more difficult in a straightforward procedural language. Fluent Perl programmers can write smaller, better-quality solutions for a specific class of functions that would be far harder and more volatile in languages like C or Java. Text processing, for example, can be easily scripted to help summarize or transform data. For some technical problems, and some domains plagued by small variations in data collection, utilizing Perl can be orders of magnitude more efficient and produce significantly higher quality. That comes intrinsically from using a more appropriate language.
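To make the contrast concrete without switching languages mid-post more than necessary, here is the same idea sketched in Java rather than Perl (the word-count example is mine): a hand-rolled lookup is just more code the language knows nothing about, while the built-in associative structure is typed, checked and already debugged.

    import java.util.HashMap;
    import java.util.Map;

    public class WordCount {
        public static void main(String[] args) {
            String[] words = { "to", "be", "or", "not", "to", "be" };

            // Hand-rolled: parallel arrays and index arithmetic. The compiler sees
            // only generic array code; nothing stops an off-by-one or mismatched index.
            String[] keys = new String[words.length];
            int[] counts = new int[words.length];
            int used = 0;
            for (String w : words) {
                int i = 0;
                while (i < used && !keys[i].equals(w)) i++;
                if (i == used) { keys[used] = w; counts[used] = 1; used++; }
                else counts[i]++;
            }

            // Built in: the associative array is part of the platform, typed and checked.
            Map<String, Integer> tally = new HashMap<>();
            for (String w : words) {
                tally.merge(w, 1, Integer::sum);
            }
            System.out.println(tally); // e.g. {not=1, or=1, to=2, be=2} (order not guaranteed)
        }
    }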

Although Perl is a more complex language than either C or Java, for specific classes of problems it is a more appropriate one. The same is true for just about every language. Each and every one, with its own syntax and paradigm, has areas of greater competency, and that always translates into time and quality.


FINAL THOUGHTS

I've written no fewer than three different versions of this post, each one grinding to a crashing halt as the text became hopelessly incomplete and lost.

Like Darwin before his ideas became clear, I have the notion and sense that there is something extremely important about how and what we express filtering about in my head. I feel as if I am just circling around the outside of some deeper understanding. Expression, as an issue for Computer Science, is far more important than whether Ruby is better than Java, or Python is prettier than Perl. It's more than static versus dynamic typing.

It pops up, over and over again, and I know that it is the key to getting to that next plateau for us. The technologies we currently have impede our expression in the same way that a crude language like Klingon or Elvish would make it hard to express some extremely complex scientific papers. It might, as if programming in assembler, be possible to pound out the full and entire text, yet the underlying pieces are too small and too frail to allow that sophistication.

Our next leap in technology, which we need soon, comes from examining our current abstractions and finding an extension, or even something completely different. More so than the C-to-Java leap, we don't just want to trade one problem for another. We want another assembler-to-Java leap, where we will well and truly bury a lot of complexity in the underlying levels, never to be seen again. In that way we can move forward and build the size and complexity of systems that will finally utilize a computer for what it can really do.