Sunday, March 29, 2009

About, Next and Garbage

Trouble often starts trivially. It's those little subtle irrationalities, the ones most people don't think are worth investigating, that gradually percolate their way into bigger, more significant issues.

Most professions accept these inconsistencies as the bias of their conventions. Software developers, however, straddle the abstract mathematical world of computing machines, and the grittiness of the real world. Those tiny little things that others might miss become essential for us to understand, or we end up watching them slowly eat away at our efforts.

A fascination with the trivial is a healthy one for someone whose income depends on mapping the real world down to a deterministic set of symbolic tokens.

The mathematical world is clean and simple. We can get so used to it that we hopelessly try to project it back onto the real world. Why are things so messy? Why can't people get organized? Why do stupid mistakes happen, and then not get caught early? We mistakenly seek to apply order to the chaos.

When you spend all day contrasting these two very different worlds, it is easy to fail to appreciate the depth of the real one. But keeping the two separate, and having separate expectations for each one, makes it easier to correctly map back and forth. If we're not trying to jam mathematical purity onto the real world we're far less likely to be disappointed by the results.

Still, where and when things are strange, erratic or inconsistent, we need to pay closer attention. Computer Science begins and ends with trivialities; they inhabit the world around us, and we have to learn to deal with them.

In the following simple examples, I'll start with three really trivial issues, then get into two more that are domain-specific. I will initially present each one of the five as simply as possible, with little or no conclusions. Each is trivial in its own right, yet each, in its own way, is an example of the very small things that work their way upwards, causing greater issues.

GARBAGE

Garbage day comes once a week. Thanks to diminishing space at toxic landfills, we're gradually separating more and more of our household garbage into specific sub-categories. At first, it was just glass, then metal cans, then cardboard and more recently compostable food materials.

Over the decades, garbage has gotten considerably more complex. What started with simple garbage bags morphed into garbage cans (to save the bags), blue boxes for glass and cans, grey boxes for paper and finally green bins for food scraps. Recently, though, the grey boxes have been dropped; their contents are now allowed in a newer, merged blue bin.

Since it is expensive to pick up all of the different types each week, they alternate on a weekly basis. Every week they pick up the green bin, but they switch between the recyclables one week and the rest the week after. To make pickup faster, and to allow them to charge us more money, they provided new (huge) bins to every household, nicely labeled 'garbage'.

So, now garbage day is either a recycle week, or a garbage week. Sadly, the term garbage now means both the overall stuff we throw away, and all of the stuff we throw away that isn't something else (a wildcard). We have two distinctly different uses for the word garbage.

NEXT

"Let's get together next Saturday" seems like such a simple sentence. If today is Tuesday, for many people this is an invitation to meet in 11 days. The closest Saturday is often referred to as "this Saturday", while the proceeding one is "next Saturday".

Of course, the dictionary definition of the word 'next' likes to use the term "nearest" in its definition. If you were sitting on a chair, the "next chair" would be the one you could reach out and touch (if it were close enough). Your next-door neighbor lives in the adjacent house.

Some people -- but they seem to be in the minority -- would take that initial invitation to mean a meeting in 4 days, instead of 11. Next for them, really is the next thing, regardless of position.

Of course, it might just be that "next Saturday" is a short-form for something like "next week, Saturday". There might be some reasonable historic answer for why next doesn't reference the nearest thing but instead references something two positions over.

ABOVE

Once, while playing a guessing game, I was asked if the name of an object started with a letter above or below F.

We read text from left to right, top to bottom, so if we were to write out the alphabet, it would be in that order. On a very narrow sheet of paper, A would start out on top and the letters would descend, one after the other. In that sense, A is clearly above F.

In a computer, however, the letters are all encoded as numbers. A starts with a relatively low value, and the values get larger and larger until we get to Z. In a numerical sense, it is not uncommon to take a number like 28 and say that it is higher than 2, which implies that it is above 2. In that sense, as encoded numbers, Z is clearly above F.
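Just to make that encoded view concrete, here is a tiny Java sketch (assuming the standard ASCII/Unicode values for the letters):

    public class AboveOrBelow {
        public static void main(String[] args) {
            // In common encodings (ASCII, Unicode) the letters are just numbers.
            System.out.println((int) 'A');   // 65
            System.out.println((int) 'F');   // 70
            System.out.println((int) 'Z');   // 90

            // Numerically, Z sits "above" F, even though on a written-out
            // alphabet A would be physically above F.
            System.out.println('Z' > 'F');   // true
            System.out.println('A' < 'F');   // true
        }
    }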

So is A above F, or is Z above F?

BOND CALCULATIONS

Bonds are fairly complex financial instruments used to indicate a debt between two parties. One party essentially borrows money from the other, at the terms specified by the bond. Because the instrument is generic, the lender is free to sell the bond to other holders if they no longer wish to keep it (they may need the underlying cash back to do something else with it).

Lots of people like buying bonds because they are a reasonably safe and consistent instrument. There is a big market. However, for the sellers (borrowers), the cost of issuing a bond can be significant. They are interested in raising the most cash they can for their efforts. Their focus is on finding new ways to get the buyers to over-value their instruments and pay more. On the flip side, with all of the buyers out there, bonds with more complicated features are likely to be under-valued. Strangely, this makes buyers interested in them as well, hoping for a deal, hoping that they are under-valued.

As such, the bond market is continually turning out new types of bonds with increasingly complex features, so as to obscure the price. Both buyers and sellers are placing small implicit bets that the other players can't or don't know how to value the instrument correctly. Even after hundreds of years, there is a steady stream of new types of financial instruments getting created. The change, and occasionally the scandals caused by gross under-valuations of instruments (like a CDS), are important parts of keeping the pressure on the markets to balance the real and perceived value of everyone's investments. Gradually, people always start believing that things are worth a bit more than they really are. We are optimistic by nature.

PRICE QUOTES

Big companies have to buy things, often in great quantities. In an age where everyone is looking for the deal, many different industries have grown up to supply these needs. Since a large company holds a tremendous amount of purchasing power, they often use that as a tactical weapon against their own suppliers.

In industries where this is common, the suppliers generally create individual and specific quotes for all of their available goods and services. Quoting is a time consuming manual process, which may seem out-of-date in the computer age, but there is an underlying need for it.

Most suppliers have quantitative price breakdowns for their wares, or at least they have some reasonable idea about how much they need to charge in order to stay in business. Those numbers are nice, but with their clients occasionally and inconsistently trying to force through a bargain of some type, suppliers have to continually make up for lost revenues.

Thus the prices for most items fluctuate depending on how negatively or positively the business dealings with the company have been. In short, if a big company forces its supplier into a deal, the supplier will record that, and eventually, the supplier will recoup the money (and often lots more). There is constant tension on the relationship, as both parties try to outmaneuver each other.

FROM ABOVE

Getting back to 'garbage', we see that in building computer systems it is not uncommon to come across terminology that, while matching domain conventions, is horribly inconsistent. It just builds up that way.

The domain experts get it, yet an outsider would have little hope of quickly detecting the ambiguities. We step on these types of landmines frequently; our sciences, like natural language, are founded on a base of massive inconsistencies.

Even in a new discipline like Computer Science, it is still not uncommon to find a definition coming from an established work like "The Mythical Man-Month" by Frederick P. Brooks using Aristotle's ancient definition of the word accidental, rather than a more modern one. The difference in meaning significantly changes how the ideas read.

Even with respect to a term as simple and obvious as 'next', we come to neither an agreement nor a consistent definition. It is well defined, yet not used that way. If most people use a term in a manner that disagrees with its definition, the convention easily overrides the definition. But when only slightly more than half do, it becomes complicated. A significant enough number of people have a more intuitive definition for "next Saturday" that it is an exceptionally risky way to specify a date. The term is ambiguous enough to render it useless.

Relative terms, like 'above', are especially at risk since their conversion into an absolute piece of information can easily be tied to perspective. If the term is not used heavily, and it has at least one intuitive meaning, we have to be careful that we're not assuming that our general understanding of it is the correct one. Because of this, it is always advised that we specify everything in absolute terms if we want to make sure there are no problems. Relative phrasing courts miscommunication.

Even if we're not getting lost in multiple contradictory definitions, the real world holds a tremendously large amount of inconsistency and irrationality. Things in the mathematical world are clean and simple, yet the real world, even with simple issues is deep and complex.

The underlying nature of the bond industry, for example, forces a constant inherent complexity over the whole process. In order to give the various different parties advantage over each other in extracting money, the underlying financial math is needlessly complicated. It's a game in which both sides are equally guilty in wanting it to be confusing. Both are hoping to win.

Sometimes, the real problems come mainly from one party, as in pricing. If it wasn't for a steady supply of aggressive executives out to make their careers by gouging deals from suppliers, the pricing could be considerably more rational. While the executroids get their deals, the suppliers often win bigger in the long run and use their client's nasty tactics as an excuse to over-charge. There is little incentive to stop these types of dealings on either side. The system is founded on an irrational, steady stream of price haggling. Most pricing has a few irrational components built-in.

Business itself -- the more you dig into it -- is essentially irrational. Or at the very least, it is like the weather: so inherently complex that one can understand the big picture while still not being able to predict whether or not it will rain the next day.

For all of the big business theories, courses, management consultants and universities claiming to understand and quantify it, the whole system always migrates back to an irrational state of being. It does this because if it didn't then it would be really easy for every party to win, which means that none would. A rational business environment would be a fair one for all, but that does not suit any smaller subset of people in business together. Fair does not equal profits.

FINALITY

Software development is about creating computer systems to build up large piles of data. Done well, we can use that data to automate and improve many different aspects of our lives.

But in building these systems we have to dig deeply into the underlying domains and try to impose a rational and deterministic structure on top. This is no trivial feat because all of this domain information is rooted in a long history of messy and irrational elements.

It is great that we get to dig deeply into other people's business, but it is also surprisingly frustrating at times. The deeper we dig, the scarier it becomes.

The biggest and most common mistake software developers make is confusing the 10,000-foot view with actually understanding the underlying domain.

Any and all of our assumptions are dangerous and have a high probability of being wrong. To get past this, you have to start by not expecting the trivial to be trivial. If you are always willing to accept the possibility that a misunderstanding has crept into the picture, then you're able to prevent yourself from being too fragile. Assumptions, even simple ones, are the source of most analytical errors.

Still, most professions constrain their practitioners into just keeping up with their own industry. Software developers, however, to survive year after year, have to both be experts in building systems, and also experts in a specific domain. General programming knowledge is good for the first five years of a programming career, but it's the domain expertise that makes it possible to last longer.

Unless you're only interested in building software for a fraction of your career, it is important to start building up significant domain knowledge. Even if, like myself, you find yourself skipping back and forth between radically different domains, digging often helps to give one a good sense of the world around them.

Sunday, March 22, 2009

Faux Stylin'

Our brains have two distinct hemispheres. Some people believe that we are almost two different people living in the same body with a minimal connection between. The left side of the brain is the rational intellectual one, while the right side is creative. This split is important to our world-view. I've badly paraphrased these ideas, which are described quite well in the classic reference:

http://www.amazon.com/Drawing-Right-Brain-Betty-Edwards/dp/0874775132

Regardless of our physical structure, we all have an inherent duality in our nature between our intellectual side and our emotional side. The two are oil and water, mixing as poorly as our concept of right and wrong so clearly fails to mix with the law of natural selection. Intellect is a higher, abstract perspective that sits above the grittiness of the real world. A while ago I finally learned to unify the two, by not bothering to try anymore.

Art -- at least the really compelling stuff that captivates us -- is the act of somehow almost magically embedding some emotional context into a work. Great songs move us emotionally, great paintings touch us deeply. The very best films we've seen leave a lasting low-level emotional connection buried deep into our sub-conscious. Good art encapsulates emotions, even if we can't exactly define what that means.

Somehow it reaches out and touches us deep inside.

Creativity, as I've often guessed, is some sort of intrinsic human flaw. After all, we combine elements that don't belong to get something new. That's clearly got to be a short in the circuitry of some type. The brain's wiring just isn't staying restricted to the "rational" things that it should. Creativity is just some kind of bug, or glitch that we manage to survive anyways. It's not fatal, but that doesn't make it normal.

The glitch can occur no matter what we are doing. So, in that very sense it doesn't matter if it's applied towards things that are emotional or things that are intellectual. Both are really the same as far as glitches go. Either half of our brain can short out at any time. Creativity is not restricted.

Building something like software, while it is composed of a complex set of underlying parts that all need to fit together and interact with each other, is clearly not an emotional exercise. It is intellectual. Other than crying out in frustration, software doesn't generally move us, we simply use it to accomplish our goals. Its byproduct could be artwork, but the mechanics of its existence are just cold and hard intellectual efforts.

Thus, it's pretty fair to say that software programming is usually an intellectual pursuit, however the visual design of that software is not.

Graphic design is an industrialization of art -- work done specifically to make revenue -- yet it is still tied into the underlying emotional context of artwork. A good design or layout retains vestiges of emotional context; that is why we find the design pleasing or correct. It may not have the grand impact of its fine art cousin, but it is still quietly and carefully touching people.

A well-designed magazine is pleasing to look at, not because it is an exercise in intelligence, but because its design, colors and overall layout are esthetically pleasing. There is still an emotional sub-content embedded in that design.


PROGRAMMERS AREN'T GRAPHIC DESIGNERS!

Most professionals know their limits, but software developers have some weird mistaken belief that because coding is intellectually creative, it somehow translates into their being able to do design work. To be emotionally creative. That all creativity is somehow the same.

Programmers are notorious for undervaluing most other disciplines. Possibly because we do so much deconstruction work that it is easy to simplify everything to the point where we foolishly think we understand it better than we really do.

And there is no other discipline that we constantly undervalue as much as graphic design. It's not uncommon to run into programmers who are extremely proud of their home-grown butt-ugly interfaces. Butt-ugly rules the industry.

Somehow we like to fool ourselves into believing that it looks good, when it clearly is hideous. Bad colors, cramped designs, awkward interfaces, etc. These things litter the programming world and often behind them is a proud, but oblivious set of coders. Sometimes even bragging about their abilities...

Because of this I need to make a very STRONG point:

- There is nothing about programming that is even remotely related to graphic design. The two are complete opposites (or at the very least should be considered so).

Being a good or bad programmer is completely unrelated to any sort of ability to do graphic design work. One is about precise intellectual work, while the other is about sloppy emotional work.

Good graphic design feels right, for completely non-rational reasons. If you can apply a pattern or create a formula for it, chances are it isn't good graphic design. Good programming on the other hand is clean, simple and obvious. Patterns are useful. There are clearly rules that can be applied, even if as an industry we have not yet progressed far enough along that we can all agree upon them.

And it's not like you can just read a couple of books or take a few courses and *presto* you 'get' graphic design. Unlike programming, graphic design isn't intellectual; it's driven by a non-rational, emotional foundation. A good sense of style, experience and a mentor are amongst the things needed by graphic designers to learn to connect with themselves on an emotional level and then channel that into creating something pleasing. If it's rational or a formula, then it just isn't emotional, is it?

If you can lay it out in a set of simple rules of thumb and follow it blindly, then it definitely isn't graphic design.


SOFTWARE DEVELOPMENT

If you're serious about writing a high quality commercial-grade piece of software there are a few things you JUST do. Some are obvious, like using source code control. They are well-known.

One thing you definitely do is hire a graphic designer to create the interface and overall experience. Artistic fellow programmers -- no matter how hard they swear they have a good design sense -- just do not have the experience or training to do a good enough job in the design. And it's not an aspect of the product that you really want to leave to chance.

In a restaurant, all cooks know that "presentation" is at least half the experience. Poorly plated food is a disaster. The same is true for software applications. If they look good, they are just so much easier to appreciate. Ugly interfaces make for cranky users.

Graphic design is absolutely critical.

Now, that being said, hiring a graphic designer is both expensive and sometimes a difficult issue. You often need to review a number of them, since they all have very distinct styles that you are buying into. Hiring a graphic designer with a retro style to create enterprise applications probably won't help sales. Hiring a corporate-style designer to create an interface for kids won't work either.

It's not easy to find a designer, and in many circumstances the project just isn't big enough to fund the work either. Sometimes, we just have to wing it. It's unfortunate, but sometimes unavoidable.

Now, before I continue I need to add in a rather hefty disclaimer:

I am not a graphic designer, and I have no idea how to really do graphic design. Most graphic designers would pee their pants laughing at the utter simplifications in the following advice. Nothing I am about to say should be taken seriously as "graphic design", and please whatever you do don't mention any of this at any party that happens to have one or more grouchy graphic designers hanging about, just waiting for a non-designer to foolishly open their mouth and spout some gratuitously anti-graphic design dogma. And in no way should the preceding sentence cast any sort of implication whatsoever that I think that many or all graphic designers are either grouchy or overly sensitive about their profession. I do not. They are generally very nice people, when they are not mad at you.

With that in mind, I'd like to list out a few simple yet key items that can help programmers get around their lack of graphic designer access for a while. If you must go for a time without a designer, then keep the following in mind while you are laying out the interfaces. These small bits of attention to detail will help in making your product suck less, up to the point that you finally give in and hire a real graphic designer to clean up your mess.


GRAPHIC DESIGN ISSUES

1. Apply a Real Color Palette

Colors work together or they do not. It's a little bit science but a lot of emotional madness. Unless you get it, you probably don't get it.

Whichever way, it is very hard to find a group of compatible colors that work together. The larger the size of the group, the harder it becomes. There are lots of web sites out there on the Internet to help. They will give a list of colors in a palette. It might seem easy then to just use one, but you always have to keep in mind the total number of colors. Palettes of 4 are easy to find, but most applications need 7 or more colors for the whole system.

Once you have the colors, they have to be applied consistently. Even though elements of graphic design are irrational, consistency in the user experience is still vital to a good design. Little inconsistencies come across as sloppiness, but not the good retro design kind...

You can also steal colors off a painting or other color artwork, but that's a bit harder of a trick to pull off. Don't try this unless you've been through a lot of color painting courses yourself.

You often see programs that started out with a reasonable design, but then later other people started applying other colors indiscriminately. That's just a quick ticket to a mess. If the circumstances for colors don't fit, then you need to redo it all as one big unit. All the colors need to fit, you can't just throw in an extra one as needed.
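One cheap way to keep the colors under control is to define the whole palette in a single place, so nothing gets added indiscriminately. A minimal sketch in Java, with purely hypothetical names and hex values:

    // All names and hex values here are invented for illustration.
    public final class Palette {
        private Palette() {}

        public static final String PRIMARY    = "#2B6777";
        public static final String SECONDARY  = "#52AB98";
        public static final String BACKGROUND = "#F2F2F2";
        public static final String TEXT       = "#333333";
        public static final String WARNING    = "#C8553D";
    }

Every screen pulls its colors from here, and nobody hard-codes a hex value inline, so adding or changing a color forces a single, deliberate edit of the palette as a whole.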


2. Don't Allow too Many Fonts

Nothing screams amateur like a screen full of different fonts. And yet it is so common to see this, particularly on web pages.

Of course it's harder than most people realize, because you'll find that many graphic designers take "different font" to mean any variations on the same font: such as size, color, bold or italic, not just differences in family.

Application screens shouldn't be composed of twenty different fonts (or variations on those fonts), it is just ugly. Somewhere I remember reading that 4 or 5 was a good number on a screen, but I'm not sure where that came from.

Everything shouldn't be one font either, since that is also hard to read. Fonts help to draw attention to the different screen elements. They serve a valuable role in helping the users navigate the functionality. They are more than just pretty decorations.

A couple of different font families can look nice together, but a lot of them might set off weird issues. Two is most likely safest.

Also, mostly it is better to drop the serifs. They are dainty ornamentation which looks good in "some" circumstances, but overall it is best to go for a simpler design where possible. Sans serif fonts are fine for most things.


3. Use Negative Space

Another thing people do frequently is cram the screen with as much crap as possible. That craigslist aesthetic can present an intimidating wall to the users; eventually they might learn to ignore it, but one doubts that they'll ever really forgive you for subjecting them to it.

Negative space is all about the blank areas on the screen that are not being used. A good graphic designer can use these areas very effectively to draw in the user and position their interest. Again, this is about controlling the individual navigation within a specific screen. Keeping this under control allows the users to get to their intended locations a lot faster and with a lot less aggravation.

Using negative space is very complicated, but in its simplest form it is probably just best to say that non-designers should make sure that some large percentage of the screen is just blank. Great graphic designers might push upwards of 50%, but it probably works for most applications to try to keep at least one third of the screen from having stuff on it. That should keep it from looking too cluttered, while not looking too empty. A delicate balancing act.


4. Create a Design Grid

To keep the same feel across a large system, all of the pieces have to share some common elements that stay the same. Some designers will create a 'design grid' that becomes the structural framework to hold all of the pieces.

They can be quite complex, and have interesting geometrical attributes. The design might be trying to drive all of the visual elements towards a diagonal line in the center, for example, or it may be pushing more simplified compositions. For instance, photography likes the rule of thirds for composition.

Whichever way, the "big" elements in the screen should have positions that don't vary for all of the screens in the same grid. A complex program may actually have a couple of design grids at different levels, but it gets more complex to pull that off.

Keep it high-level, and keep it simple. A design grid should only have a half dozen or so things hanging off it. Draw it out in the beginning, and then stick to it. If something breaks the grid, first try with all of your effort and might to get it to fit. Only go for a second one if it's impossible for the first one to work, but then you have to rejig all of the existing similar screens to match for consistency, which can be a lot of work.


INTERACTIVE DESIGNER

The above items work for static "print" like representations. Many dynamic computer systems have a few other intrinsic issues that come under the category of interactive design, a sort of off-shoot from traditional graphic design that is dedicated towards interactive experiences.


5. Consider Flow and Resizing

The screens change in size, and in many cases the data can grow to be rather large. These changes need to be taken into account in the design grid, so that as they grow, the design doesn't get uglier.

The easiest thing to do is to restrict the flow to only one direction. That is why many web apps have a fixed width and a growing height downwards.

Two expanding directions becomes very difficult particularly if the design grid is based around some proportional representation. A fully re-sizable web application, for example, doesn't even have a default size anymore. Laptops, wide screen monitors and a huge array of different devices mean that nothing is certain for the screen dimensions.

If you need to grow into both directions, then trying to keep some type of even balanced layout is best, yet extremely hard. As the aspect ratio of the screen changes, the characteristics of the screen layout change as well. Diagonal growth is both a complex coding trick and a complex design interaction.


6. Minimize Navigation for High Frequency Operations

History, more than anything else, is responsible for where a lot of the functionality ended up in most modern software systems. As such you get the "Microsoft Problem", where after years of junior-level independent teams haphazardly adding mass amounts of functionality to overloaded, bloated applications like "Word", the structural organization of the functionality -- which started at one time as deterministic and orderly -- is now basically erratic. You never really know if a piece of functionality exists in an application until you've completed an exhaustive search of all of the menus. A problem that is both irritating and wasteful.

The best thing to do is to get real statistics on which functions are used how often and by whom, then use those numbers (and consistency) to drive the placement of the functionality in the application. Of course, as this does change as the application grows, it provides a strong argument for using very short, loose coupling between the triggers for the functionality and the functionality itself. Growth often drives the need for occasional interface reorganization. Accepting that means less work in the long run.
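Gathering those statistics doesn't have to be elaborate. As a rough sketch (the class and method names here are hypothetical), every menu item or button could funnel through a single counter whose totals later drive where things sit in the navigation:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Hypothetical usage counter; nothing here comes from a specific product.
    public final class FeatureUsage {
        private static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();

        public static void recordUse(String featureName) {
            COUNTS.computeIfAbsent(featureName, k -> new LongAdder()).increment();
        }

        // Dump the totals periodically; the most-used features are candidates
        // for the shortest navigation paths, the rare ones can sit deeper.
        public static void dump() {
            COUNTS.forEach((feature, count) ->
                System.out.println(feature + " -> " + count.sum()));
        }
    }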


SUMMARY

Presentation is half of what makes up a good program. In order to use something well, you have to want to use it, and crappy tools make themselves more difficult. Less inviting.

Many people mis-estimate the complexities involved in making things look good, so they often don't. They delude themselves into thinking that they have the skill-set required to do real graphic design. Just because something can't be fully codified -- it's impossible to write an all-encompassing manual for it -- doesn't mean that it is simple. Programmers know this about programming, but refuse to accept it in other domains.

In the end, an ugly application that is hard to navigate still does the job, but it is nothing to be proud of. Often it's not that hard to get help, and it's not that much work to apply it. Some people have to get beyond their own narrow view that it's not hard, not complex, or not worth the extra money, time or effort to get right. It's hard, expensive and always worth it. That is, if you want professional results.

If you really think you have graphic design skills -- and you just might -- the only way to know for sure is to start taking a lot of art courses. A lot of art courses; one or two just isn't enough. A raw design sense and some untrained ability is not enough to do a professional job, but it is enough to be able to justify spending more effort learning. Programmers -- especially as they age -- need more than just programming skills to survive; half coder, half designer is a strong combination, but it has to be backed by real knowledge and experience, not just hubris.

Saturday, March 14, 2009

In Framing Work

"Wouldn't it have been better to allow the programmers to attach their own hash table directly into the form? " one of my fellow developers said gently.

"That way they can change the values in the hash table, and then update the form -- it's a more common approach to do it that way" he explained.

I was in the middle of discussing my latest piece of code, a complex generic form handling mechanism built on top of GWT.

It was a good question, in that while I was programming in Java I had clearly drifted away from the accepted Java paradigms, and into something else completely different.

Mostly that is a pretty bad idea. If a platform has an accepted way of handling a specific problem, you need pretty strong justification to choose to go at the code in a different way. It's always about consistency, and picking one well-known way of handling a problem then sticking to it is superior to filling the code with a million different ways to accomplish the same results. Simplicity and consistency are tied together.

However, in this case I was fairly happy with my results. I had deliberately chosen a different path, one that I was consistently applying in the upper layers of this code. It's intrinsic to the philosophy of the architecture. Consistency with the overall paradigm is usually more important than getting caught up in an underlying language one. Besides, strictly speaking, only half of GWT is Java; the client half is Java-inspired JavaScript.

I chose this path deliberately because it matched my objectives, and I chose my objectives in order to maximize my resources. They are related.

The architecture can and often should reflect elements of the environment in which it is being developed. We are interested in being successful, and the definition of that changes depending on where we are located. A point easily over-looked.

Despite my confidence in the design, the question was still really interesting, because it forced me to think about the various pieces that combined to affect the architecture. I've been at this a long time and some things just become subliminal. I stop thinking about them. You know they work, but you never really move it into the foreground to examine it. Following this question through, leads into a lot of interesting issues. Things worth considering.


FRAMEWORKS AND LIBRARIES

The first issue comes from having a different perspective on code.

All programs have a control loop. That is the main flow of control; often one or more tight loops that continually go over and over again before they call out to the functionality. In a GUI an event loop is a common example. In some batch systems, there might be a pending queue of operations -- a finite control loop, but never-the-less the same concept. All programs have a control loop of some type, even if in a few batch cases it is trivial.

If you're looking at large pieces of code, we can classify them by whether or not they contain a control loop. This is already a common classification, it's just never really been formalized.

Thus, a 'library' of functionality definitely has no control loop. The code enters, and gets out quickly, heading back to the main event loop somewhere. You call a library to do something specific for you.

A 'framework', like Struts for thin clients, or Swing for thick ones definitely encapsulates a loop. Working with that type of code means attaching bits of functionality at various open places in the framework. Sort of like affixing pieces to the bottom of a large appliance. You hand over control to a framework for some indeterminate period of time.

The more normal definition of framework usually implicitly means that there is one and only one global one, but if you're following the control loop argument, there is no reason why we can't have many layers of frameworks. Frameworks within framework, and frameworks embedded in libraries. As libraries layer upwards -- you build on them -- frameworks layer downwards.

From this perspective, we can then assert that every component in a software system either contains a control loop of some type or it does not.

Thus every component is either a 'library' or a 'framework'. That type of breakdown is really convenient, in that we can now see the two as being the yin and yang of all software code. All large code blocks can be deconstructed into discrete components of either frameworks or libraries. This allows us to see the code as being one or the other.
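To make the distinction concrete, here is a rough sketch in Java (the CSV example and all of the names are invented, not taken from any real system):

    // A 'library': control enters, does something specific, and returns quickly.
    final class CsvLibrary {
        static String[] splitLine(String line) {
            return line.split(",");          // in and out, no loop of its own
        }
    }

    // A 'framework': it owns the control loop and calls back out to bits of
    // functionality attached at its open places.
    interface RowHandler {
        void onRow(String[] fields);
    }

    final class CsvFramework {
        static void process(Iterable<String> lines, RowHandler handler) {
            for (String line : lines) {      // the control loop lives here
                handler.onRow(CsvLibrary.splitLine(line));
            }
        }
    }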


GOALS

For a lot of code -- especially libraries -- amongst the main goals is the desire to allow the code to be re-used multiple times. We want to leverage it as much as possible.

This often leads to a large number of configurable parameters getting added, massively kicking up the complexity. A common side-effect is poor interaction between all of the configurable pieces, forcing people using the technology to stick to well-known combinations of options, more or less invalidating the work that went into it.

The programmers then get a huge amount of freedom to utilize the library in any way possible. On this, they can build a lot of code. They also have the freedom to do things incorrectly or inconsistently. Freedom is a great idea until it becomes anarchy.

My goals for this latest system are pretty much the opposite. I don't want to provide a huge number of small lower-level primitives to build with; I want to restrict the programmers where and however possible. They need just enough flexibility to get the work done easily, but not enough to make it inconsistent. The system itself should constrain them enough to enforce its own consistency. I don't want open, I want simple and consistent, but I'll get back to that in a bit ...

We're taught to separate out the presentation layer from the model, but do we really understand what that means?

At some lower layer the system holds a huge amount of data that it is manipulating. Lots of the data is similar in many different ways. Ultimately we want to preserve consistency across the system when displaying the different types. In that sense, if we create a 'type' hierarchy to arrange the data, then we can choose reasonably consistent ways to display data whenever it is similar. The username data should always appear the same wherever it is located, and so should groupname data.

The data in any system comes in the shape of a fixed series of regular types. The type and structure are almost always finite. Dynamic definitions are possible, but underneath the atomic pieces themselves are static. Everything in a computer is founded on discrete information. We need that in order to compile or run the code.

To these types we want to attach some presentation aspects. Username, for example, is the type. When reading or writing this data, the widgets, colors, fonts, etc. we use on the screen are the actual presentation. It's the way we ask for and display this data.

If the same data appears on twenty different screens, then it should be displayed the same on each screen. Of course data clumps together and different screens present different sub-contexts of the data, but almost all screens are composed of the same finite number of data elements. They switch around a lot but most systems really don't handle a massive number of different types of data, even if they do handle a massive amount of data. The two are very different.

If we create a model of the data, using a reasonable number of types, then for "presentation" all we have to do is define the specific clumps of data that appear together and then bind them to some presentation aspects.

As we are presenting the data, we don't need to know anything about it other than its type. In a sense, all screens in every system can be completely generalized, except for the little bit of information (name, type and value) that gets moved around.
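As a small sketch of the idea (all of the names are hypothetical, and real widgets are simplified down to plain strings), the presentation side only needs a registry that maps types onto renderers:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical type-to-presentation binding; a real version would hand
    // back widgets rather than strings.
    final class PresentationRegistry {
        interface Renderer {
            String render(String name, String value);
        }

        private final Map<String, Renderer> byType = new HashMap<>();

        void bind(String type, Renderer renderer) {
            byType.put(type, renderer);
        }

        // Any screen that needs to show a username asks the registry, so
        // usernames look the same everywhere they appear.
        String display(String type, String name, String value) {
            return byType.get(type).render(name, value);
        }
    }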

Getting back to my goals, after having written a huge number of systems it has become obvious that for most, a large number of the screens in the system are simply there to create, edit or display the basic data.

The real "meat" of most systems is usually way less than 20% of the screens or functionality. The other 80% just need to be there to make the whole package complete. If you can create data, you need to edit it. If it's in the system you need to display it. All systems need a huge number of low frequency screens that support it, but are not really part of the core. Work that is time-consuming, but not particularly effective.

We can boil down a basic system into a lot of functionality, yet most of it revolves around simple screens that manipulate the data in the system.

One problem with growing systems is that once the functionality -- which started with a few hundred different features -- grows enough, the static hard-coded-ness of the screens becomes a problem. As the functionality grows into thousands of features, the programmers don't have the time or energy to re-organize the navigation to get at it. Thus, virtually all major systems get more and more features, but become more and more esoteric in their placement. And it's a huge and expensive risk to interfere with that degeneration. A gamble lost by many companies over the years.

For all programs, the difference between a few features and a few thousand is a completely different interface. Yet that work can't happen because it's too large. Thus the code-base becomes a disaster. Everybody tip-toes around the problems. And it just gets worse.

So, most of the functionality, in most systems, is simple, boring and static. And its organization/navigation represents a huge upcoming problem (technical debt) for the development, whether or not anyone has realized it. What to do?

Clearly we want to strip away absolutely everything unnecessary and, while it's still static, make the remainder as "thin" as possible. We want just the bare essence of the program in its own specification.

To do this we can see all interactive programs as just being some series of screens, whether static or dynamic. There are always very complex ways to navigate from one location in the system to another. Thus an overly simplified model for most systems is some type of navigation over a set of discrete (finite) screens. A screen might contain dynamic data, but the screen itself has a pre-determined layout so that it is consistent with the rest of the system.

Now, we know from the past that any overly simplified model, like the above, simply won't work. A great number of 4GLs were written in the 90s and then discarded trying to solve this exact problem. One of our most famous texts tells us explicitly that there is no such thing as a "silver bullet".

But I'd surmise that these early attempts failed because they wanted to solve 100% of the issue. They were looking for a one-size-fits-all solution. For my work, I'm only really interested in a fixed arrangement for the 80% of screens that are redundantly necessary, yet boring. The other 20% are SEP, as Douglas Adams pointed out in one of the Hitchhiker's Guide to the Galaxy books: Somebody Else's Problem.


ARCHITECTURE

In the definition of the navigation, and the definition of the screens, I really want to specify the absolute minimum "stuff" to map some types onto sets of data within screens.

Navigation is easy, it's really just various sets of locations in the system, and ways of triggering how to get there. Some are absolute, some are relative to the current location.

The sets of data are also easy, we can choose a form-based model, although being very generous so the forms can do all sorts of magical things. Well beyond a simple static one.

For flexibility we might have several 'compacted' screens, basically several simple main-frame like screens all appearing together to ease navigational issues. Web apps deal with condensed navigation well -- originally it was because of slow access -- so it's wise to make use of that where ever possible.

The forms simply need to bind some 'model' data to a specific presentation 'type'. A form needn't be interactive, so it could just be a big display of some data. With this model in mind, all of the above is a purely presentational aspect.

Why not choose MVC like everyone else? A long time ago I built a really nice thick client MVC system, where I had a couple of models and the ability for the user to open up lots of different windows with different views on the same underlying models. If they changed one aspect of a model in one window, it was automatically updated in another. That really nice, yet complicated, trick is the strength of MVC, but it's not necessary in a web application. Since there is only one view associated with the underlying model, using a fancy auto-updating pattern is far too complex and far too much overkill when the goal is to simplify.

Still, although I wasn't headed for MVC, separating out the presentation meant having a model anyways. And in my system, the presentation layer was in the client side of the architecture, and the model ended up in the server side. It fit naturally within the GWT client/server architecture.

The client becomes 100% responsible for presentation, while the server is only interested in creating a simplified model of the data. The client is the GUI, the server is the functionality.

Popular philosophy often tries to convince programmers that the system's internal model and the one in the database should be one great "unified" view of the data. That's crazy since, particularly with an RDBMS, the data can be shared across many applications. That's part of the strength of the technology. In that sense, the database should contain all of the data in some domain-specific 'universal' view. That is, the data should be in its most natural sense relative to the domain from which it was gathered. For example, someone pointed out that banks don't internally have things called savings accounts, just transaction logs. "Savings accounts" are the client perspective on how their money exists in the bank, not the actual physical representation. A bank's central schema might then only contain the transactions.

On the other hand, all applications -- to be useful -- have some specific user-context that they are working in. The data in the application is some twisted, specific subset of the universal data. Possibly with lots of state and transformations applied. It's context specific. As such, the application's internal model should most closely represent that view, not the universal one. An application to create and edit savings accounts should have the concept of a savings account.

Why put universal logic into a specific application or specific data into a universal database? All applications have two competing models for their data. Lots of programmers fail to recognize this and spend a lot of time flip-flopping between them, a costly mistake.

Getting back to the architecture, the application specific model sits in the server side, while the presentation side is in the client.


MORE CHOICES

Another popular dictate is to not write a framework. But given my earlier definitions, an application can (and should) have a large number of smaller frameworks depending on the type of functionality it supports. Writing smaller frameworks is inevitable, in the same way that writing custom libraries is as well. It's just good form to build the system as reusable components.

My choice in building the form mechanism was between creating it as a library for everyone to use, or creating it as a framework and allowing some plugin functionality.

While the library idea is open and more flexible, my goals are to not allow the coders to go wild. Open and flexible go against the grain. I simply want the programmers to specify the absolute minimum for the navigation, screens and data subsets. As little as possible, so it can be refactored easily, and frequently.

A framework on the other hand is less code, and it can be easily controlled. The complex behaviors in the forms, as the system interacts, can be asked for by the programmers, but it's the framework itself that implements them. Yes, that is a huge restriction in that the programmers cannot build what the framework doesn't allow, but keep in mind that this restriction should only apply to the big redundant stuff. The harder 20% or less, will go directly to the screen and be really really ugly. It's just that it will also be unique, thus eliminating repeated code and data.

If the code is boiled down into nothing but its essence, it should be quick to re-arrange it. Not being repeated at all is the definition I put forth for six normal form. The highest state of programming consistency possible. Which means it's intrinsically consistent, the computer is enforcing the consistency of the screens, and where it's not, it's simply a binding to a type that can be easily changed.

The form definitions themselves are interesting too. Another popular design choice is to make everything declarative in an external language/format like XML. Frameworks like Struts have done this to a considerable degree, pushing huge amounts of the structural essence of the code, out of the code and into some secondary format.

Initially I was OK with these ideas, but over time, in large systems, they start to cause a huge amount of distributed information complexity.

A noble goal of development, known as Don't Repeat Yourself (DRY), is founded around the idea that redundancies should be eliminated because duplicate data falls out of sync. Often, however, we have to have the same data in a couple of different formats, making it impossible to eliminate. If we can't get rid of duplicated yet related data, we certainly can work very hard to bring all of the related elements together in the same location. This type of encapsulation is critical.

The stripped-out declarative ideas do the exact opposite. They distribute various related elements over a large number of configuration files, making it hard to get the full picture of what is happening. Object Oriented design is also guilty of this to some degree, but where it really counts, experienced programmers will violate any other principles if it helps to significantly increase the localization and encapsulation.

Virtually any set of ideas that center around breaking things down into small components and filing them away, start to fail spectacularly as the numbers of components rise. We can only cope with a small number of individual pieces, after which the inter-dependencies between those pieces scales up the complexity exponentially. Ten pieces might be fine, but a few hundred is a disaster. It's a common, yet often overlooked issue that crops up again and again with our technologies.

Getting back to the forms, my goal was that the representation be static, simple and completely encapsulating. The form is different from the data, but they both are simple, finite descriptions of things. All of the forms and the data in the system can be reduced to textual representations. In this case I picked JSON for the format, because it is so much smaller than XML, doesn't always require the naming of the elements, and because JSON is easily convertible to JavaScript, where the client and the form framework are located.
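For illustration only, a stripped-down form definition might look something like the following; the field names and structure here are invented, not taken from the actual system:

    // Hypothetical example of a minimal form definition held as JSON.
    final class UserFormDefinition {
        static final String JSON =
            "{ \"form\": \"editUser\","
          + "  \"fields\": ["
          + "    { \"name\": \"username\",  \"type\": \"username\"  },"
          + "    { \"name\": \"groupname\", \"type\": \"groupname\" },"
          + "    { \"name\": \"email\",     \"type\": \"email\"     }"
          + "  ] }";
        // The 'type' drives the presentation (widget, colors, fonts); the form
        // only lists which pieces of data clump together on this screen.
    }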


SERVER SIDE AND BACK

In this design, the programmers have little control over the presentation, thus forcing them to use it consistently. On the back-end, the model is still reasonably flexible, however it is usually bound to a relational database.

The schema and any relatively discrete application-context transformation away from that schema have limited expressibility. You can get past that by giving the application context more state, but that's not the best approach to being simple, although sometimes it can't be helped.

Still, all you need to do is pass along some functionality indicator, assemble data from the database in the proper model, and then flatten it for transport to the front-end somehow. The expressibility of the front is actually flattened by definition anyways, so it's possible to put all of the data into some discrete flat container stored in JSON and pass it to the presentation layer. All that's needed is a simple way to bind the back-end data to some front-end presentation. A simple ASCII string for each name is sufficient.
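As a small, hypothetical sketch of that flattening step (the SavingsAccount fields are invented for illustration), the server just squashes its application model into name/value pairs before the trip to the front-end:

    import java.math.BigDecimal;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical application model; the real system's types will differ.
    final class SavingsAccount {
        String id;
        String owner;
        BigDecimal balance;
    }

    final class Flattener {
        // Squash the model into a flat container of simple names and values,
        // ready to be serialized (e.g. as JSON) and handed to the presentation.
        static Map<String, String> flatten(SavingsAccount account) {
            Map<String, String> container = new LinkedHashMap<>();
            container.put("accountId", account.id);
            container.put("owner", account.owner);
            container.put("balance", account.balance.toPlainString());
            return container;
        }
    }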

Incoming data goes into a form, and then interacts with the user in some way. Most systems involve modifying data, so a reverse trip is necessary, as the data goes out from the form, over to the back-end and then is used to update the model. From there it is persisted back into the database.

The whole loop, database -> front-end -> database consists of some simple discrete and easily explainable transformations. Better described as

universal-model->app-model->container->presentation->container->app-model->universal-model

It's the type of thing that can be easily charted with a few ER diagrams and some rough sketches of the screens. The system needn't be any more complex than that.

Getting back to the form mechanics. The framework simply needs to accept a form definition, add some data and then do its thing. From time to time, there may be a need to call out and do some extra functionality, such as fetch more data, sort something or synchronize values between different form elements. But these hooks are small and tightly controlled.
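A rough sketch of the kind of narrow surface such a form framework might expose (all of these names are hypothetical, not the real API):

    import java.util.Map;

    // Hypothetical interface; the actual GWT-based framework differs.
    interface FormFramework {
        // Hand over a definition plus its data; the framework owns the
        // interaction loop from here on.
        void show(String formDefinitionJson, Map<String, String> data);

        // Small, tightly controlled hooks the framework calls out to when it
        // needs extra work done (fetch more data, sort, synchronize fields).
        interface Hook {
            Map<String, String> onAction(String action, Map<String, String> current);
        }
        void register(String action, Hook hook);

        // When the form is finished, the data comes back to the caller.
        interface CompletionHandler {
            void onDone(Map<String, String> result);
        }
        void onComplete(CompletionHandler handler);
    }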

At some point, either at the end or based on some set of events, the form gives control back to the screen, chucking up some data with it. Allowing the code to tie back to the database or move to some other point in the navigation.

The mechanism is simple. The programmers give the framework data, and at some point, when it's ready, they can get it back again. Until then, it's the framework's problem. They can now spend their time and energy working on more complex problems.

Aside from controlling the consistency, the design also helps with testing. With an inverted form library design there is a lot of flexibility, but lots to go wrong. With a black-box style form framework, the internal pathways are well-used and thus well-tested. The system is intrinsically consistent. It's utilizing the computer itself to keep the programmer from making a mess.


SUMMARY

There is more; generalizing this type of architecture involves a larger number of trade offs. Every choice we make in design is implicitly a trade off of some type.

If you boil down the domain problems, you most often find that the largest bulk of them are just about gathering together big data sets.

Mostly the applications to do this are well understood, even if we choose to make them technically more complex. Still, while writing a multi-million line application may seem impressive, it's bound to accumulate so much technical debt that its future will become increasingly unstable.

The only way around this is to distinguish between what we have to say to implement some specific functionality and what is just extra noise added on top by one specific technology or another. In truth, many systems are more technologically complex than they are domain complex. That is a shame because it is almost never necessary. Even when it is, a great deal of the time the technical complexity can be encapsulated away from the domain complexity. We can and should keep the two very separate.

Even after such a long discussion, there will still be a few people unconvinced by my explanation. Sure of my madness. The tendency to build things as libraries, because that's the way it's always been done, and because frameworks are currently considered verboten will be too strong for many programmers to resist.

We crave freedom, then strictly adhere to a subset of subjective principles, which is not always the best or even a rational choice. Getting the simplicity of the code under control so extensions to the system aren't inconsistent or tedious is a far more important goal than just about any other in programming. If it means breaking some nice sounding dogma, to get simple and expandable, then it is worth it. Code should be no more complex than it must be. We know this, but few really understand its true meaning. Or how hard it is to achieve in practice.

Developers shouldn't be afraid to build their own frameworks or libraries, or any other code, if the underlying functionality is core to their system. That doesn't mean it should be done absolutely from scratch -- there is no sense in rediscovering anew all of the old problems again -- but if one can improve upon existing code, particularly after reading and understanding it, then it's more than just a noble effort to go there. Dependencies always cause problems and waste time. It's just a question of whether that effort in the long run is more or less than the design, development and testing of a new piece of code. If you really know what you are doing, then a lot more comes within your reach.