Saturday, August 3, 2013

Time and Shortcuts

I haven't posted anything for a while now. This has been the longest gap I've had since I started blogging five years ago. It's not that I don't have anything to say -- my hard drive is littered with half-finished posts -- but rather that I just haven't had the time or energy to get my thoughts down nicely.

Lately I've been wrapped up in a large, complicated system that has a wide array of problems, none of which are new to me, but it's unusual to see them all together in the same project. I like this because it is extremely challenging, but I definitely prefer projects to be well-organized and focused on engineering problems, not organizational ones.

My sense over the last couple of decades is that software has shifted from being an intense search for well-refined answers to just a high-stress fight against allowing all of the earlier shortcuts to swamp the project. I find that unfortunate, because the part of the job that I've always loved was the satisfaction of building something good. Just whipping together a barely functioning mess is something I find depressing.
 
What I've noticed over the years is that there is a whole lot more progress made when the code is well thought out and clean at all levels (from the syntax to the interface). The increasingly rare elegant solution takes it one step higher. You get so much more out of the work, since there are so many more places to leverage it. But of course spending the time to get it right means coming out of the gate that much slower, and it seems that people are increasingly impatient these days. They just want a quick band-aid, whether or not it will make the problem worse.

Getting a mess back under control means having to say 'no' often. It's not a popular word, and saying it comes with a lot of angst. It does not help that there are so many silver-bullet approaches floating about out there. Anyone with little software knowledge is easily fooled by the overabundance of snake charmers in our industry. It's easy to promise that the work will be fast and simple, but experience teaches us that the only way to get it done is hard work and a lot of deep thinking. Writing code isn't that hard; you can teach it to high school students fairly quickly. But writing industrial strength code is a completely different problem.

I'd love to take off some more time and write another book that just lists out the non-controversial best practices that we've learned over the last few decades -- a software 101 primer -- but given that my last effort sold a massive 56 copies and I'm a wage slave, it's not very likely that I'll get a chance to do this anytime soon. The trick, I think, is to shy away from the pop philosophies and stick to what we know actually works. Software development is easy to talk about and easy to theorize about, but what often really works in practice is counter-intuitive for people with little experience. That's not unusual for complex systems, and a large development project contains millions of moving parts (people, code, technology, requirements, data, etc.) with odd, shifting dependencies. You can't understand how to organize such a volatile set of relationships without first delving deep into actual experience, and even then it's hard to structure the understanding and communicate it.

What flows around the production of the code is always more complex than the code itself and is highly influenced by its environment. A good process helps the work progress as rapidly as possible with high-quality results; most modern methodologies don't do that. That's one of our industry's most embarrassing secrets, and it seems to be only getting worse.

Hopefully one of these days I'll catch my breath and get some of my half-finished posts completed. There are some good lessons learned buried in those posts; it just takes time and patience to convert them into something shareable.

Sunday, June 16, 2013

Relationships

“Everything is relative in this world, where change alone endures.”

A huge problem in software development is that we create static, rigid models of a world constantly in flux. It’s easy to capture some of the relationships, but getting them all correct is an impossible task.
Often, in the rush, people hold the model constant and then overload parts of it to handle the change. Those types of hacks usually end badly. Screwed-up data in a computer can often be worse than no data. It can take longer to fix the problem than it would to just start over. But of course if you do that, all of the history is lost.
One way to handle the changing world is to make the meta-relationships dynamic. Binding the rules to the data gets pushed upward toward the users; they become responsible for enhancing the model. The abstractions to do this are complex, and it always takes longer to build than just belting out the static connections, but it is often worth adding this type of flexibility directly into the system. There are plenty of well-known examples such as DSLs, dynamic forms and generic databases. Technologies such as NoSQL and ORMs support this direction. Dynamic systems (not to be confused with the mathematical ‘dynamic programming’) open up the functionality to allow the users to extend it as the world turns. Scope creep ceases to be a problem for the developers; it becomes standard practice for the users.
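To make that a little more concrete, here is a minimal Python sketch (the field names and the 'document' example are purely hypothetical) of a record whose structure is defined by data the users control, rather than by hard-coded classes:

    import datetime

    # A user-managed definition of which fields a 'document' record has.
    # Adding a field is a data change made by the users, not a code change.
    field_definitions = {
        "title":      str,
        "author":     str,
        "created_on": datetime.date,
    }

    def make_record(defs, **values):
        # Validate the supplied values against the current, user-owned definitions.
        record = {}
        for name, expected_type in defs.items():
            value = values.get(name)
            if value is not None and not isinstance(value, expected_type):
                raise TypeError(f"field '{name}' expects {expected_type.__name__}")
            record[name] = value
        return record

    # Later, the users extend the model themselves; no release required.
    field_definitions["reviewed_by"] = str

    doc = make_record(field_definitions, title="Q3 Report", author="Sam",
                      created_on=datetime.date(2013, 6, 16))

The point isn't the specific mechanism; it's that the shape of the data lives in data, so scope creep lands on the definitions rather than on the code.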
Abstracting a model to accommodate reality without just letting all of the constraints run free is tricky. All data could be stored as unordered variable strings, for instance, but the total lack of structure renders the data useless. There needs to be categorization and relationships to add value, but they need to exist at a higher level. The trick I’ve found over the years is to start very statically. For all domains there are well-known nouns and verbs that just don’t change. These form the basic pieces. Structurally, as you model these pieces, the same types of meta-structures reappear often. We know, for example, that information can be decomposed into relational tables and linked together. We know that information can also be decomposed into data structures (lists, trees, graphs, etc.) and linked together. A model gets constructed on these types of primitives, whose associations form patterns. If multiple specific models share the same structure, they can usually be combined and, with a little careful thought, named properly. Thus all of the different types of lists can become just one set of lists, all of the trees can come together, etc. This lifts the relationships up, by structural similarity, into a considerably smaller set of common relationships. This generic set of models can then be tested against the known or expected corner-cases to see how flexible it will be. In this practice, ambiguity and scope changes just get built directly into the model. They become expected.
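A small sketch of that lifting-by-structural-similarity idea, using an invented set of hierarchies: rather than one tree implementation per concept, the shared structure is captured once and the specifics become labels on the data:

    class Node:
        # One generic tree primitive; 'kind' says which specific model
        # (org chart, folder hierarchy, category taxonomy, ...) it belongs to.
        def __init__(self, kind, name, children=None):
            self.kind = kind
            self.name = name
            self.children = children or []

        def walk(self):
            yield self
            for child in self.children:
                yield from child.walk()

    # Three formerly separate models, now differently labelled instances of
    # the same structure, so there is one set of traversal and storage code.
    org = Node("org", "CEO", [Node("org", "Engineering"), Node("org", "Sales")])
    folders = Node("folder", "/", [Node("folder", "reports")])
    taxonomy = Node("category", "Products", [Node("category", "Hardware")])

    for tree in (org, folders, taxonomy):
        print([n.name for n in tree.walk()])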
Often when enhancing the dynamic capabilities of a system there are critics who complain of over-engineering. Sometimes that is a valid issue, but only if the underlying model is undeniably static. There is a difference between ‘extreme’ and ‘impossible’ corner-cases; building for the impossible is a waste of energy. Oftentimes, though, the general idea of abstraction and dynamic systems just scares people. They have trouble ‘seeing it’, so they assume it won’t work. From a development point of view that’s where encapsulation becomes really important. Abstractions need to be tightly wrapped in a black box. From the outside, the boxes are as static as any other piece of the system. This opens up the development to allow a wide range of people to work on the code, while still leveraging sophisticated dynamic behavior.
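As a sketch of that black-box idea (all of the names here are made up), the dynamic machinery stays inside the box while the rest of the team codes against a few ordinary, static-looking calls:

    class DocumentStore:
        # Internally generic: everything is held as loosely typed records.
        def __init__(self):
            self._records = {}

        # Externally static: callers use these fixed methods and never
        # touch the dynamic model directly.
        def save(self, doc_id, fields):
            self._records[doc_id] = dict(fields)

        def title_of(self, doc_id):
            return self._records[doc_id].get("title", "")

    store = DocumentStore()
    store.save(1, {"title": "Relationships", "author": "someone"})
    print(store.title_of(1))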
I’ve often wondered how abstract a system could get before its performance was completely degraded. There is a classic tradeoff involved. A generic schema in an RDBMS, for example, will ultimately have slower queries than a static 4th NF schema, and a slightly denormalized schema will perform even better. Still, in a big system, is losing a little bit of performance an acceptable cost for not having to wait 4 months for a predictable code change to get done? I’ve always found it reasonable.
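To put some rough code behind that tradeoff, here is a small sqlite sketch (the tables and column names are only for illustration) of the two extremes: a static schema where each attribute is a column, and a generic one where attributes are rows. The generic form absorbs new attributes without a schema change, but every read pays for that with extra filtering, joining or pivoting:

    import sqlite3

    db = sqlite3.connect(":memory:")

    # Static: direct, fast queries, but adding an attribute means an
    # ALTER TABLE and a code release.
    db.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    db.execute("INSERT INTO client VALUES (1, 'Acme', 'Toronto')")

    # Generic (entity-attribute-value): new attributes are just new rows,
    # but reassembling a full record takes grouping or repeated self-joins.
    db.execute("CREATE TABLE attr (entity INTEGER, name TEXT, value TEXT)")
    db.executemany("INSERT INTO attr VALUES (?, ?, ?)",
                   [(1, "name", "Acme"), (1, "city", "Toronto"),
                    (1, "region", "East")])   # added later, no schema change

    print(db.execute("SELECT name, city FROM client WHERE id = 1").fetchone())
    print(db.execute("SELECT name, value FROM attr WHERE entity = 1").fetchall())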
But it is possible to go way too far and cause massive performance problems. Generic relationships wash out the specifics and drive the code into NP-complete territory or worse. You can model anything and everything with a graph, but the time to extract out the specifics is deadly and climbs at least exponentially with increases in scale. A fully generic model, where everything is just a relationship between everything else, is possible but rather impractical at the moment. Somewhere down the line, some relationships have to be held static in order for the system to perform. Less is better, but some are always necessary.
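As a toy illustration of that generic extreme (the data is invented): when everything is just a relationship between things, even a simple question like 'what is connected to this department?' becomes a search over the model as a whole, and the richer pattern-matching queries only get more expensive from there:

    from collections import defaultdict, deque

    # Everything is modelled as 'something relates to something else'.
    edges = [("Engineering", "employs", "Sam"),
             ("Sam", "wrote", "Design Notes"),
             ("Design Notes", "references", "Q3 Report")]

    graph = defaultdict(list)
    for a, _, b in edges:
        graph[a].append(b)

    def reachable(start):
        # Breadth-first search: in the worst case it touches every edge in
        # the model, regardless of how small the actual answer is.
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen - {start}

    print(reachable("Engineering"))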
Changing relationships between digital symbols mapped back to reality is the basis of all software development. These can be modeled with higher-level primitives and merged together to avoid redundancies and cope with expected changes. These models drive the heart of our software systems; they are the food for the algorithmic functionality that helps users solve their problems. Cracks in these foundations propagate across the system and eventually disrupt the user’s ability to complete their tasks. From this perspective, a system is only as strong as its models of reality. It’s only as flexible as they allow. Compromise these relationships and all you get is unmanageable and unnecessary complexity that invalidates the usefulness of the system. Get them right and the rest is easy.

Saturday, June 1, 2013

Process

A little process goes a long way. Process is, after all, just a manifestation of organization. It lays out an approach to some accomplishment as a breakdown of its parts. For simple goals the path may be obvious, but for highly complex things the process guides people through the confusion and keeps them from missing important aspects.
Without any process there is just disorganization. Things get done, but much is ignored or forgotten. This anti-work usually causes big problems, and these feed back into the mix, preventing more work from getting accomplished. A cycle ensues, which among other problems generally affects morale, since many people start sensing how historic problems are continuously repeating themselves. Things either swing entirely out of control, or wise leadership steps in with some "process" to restore the balance.
Experience with the chaotic non-process can often lead people to believe that any and all processes are a good thing. But the effectiveness of process is essentially a bell curve. On the left, with no process, the resulting work accomplished is low. As more process is added, the results get better. But there is a maximal point, a point at which the process has done all that it can, after which the results start falling again. A huge, over-the-top process can easily send the results right back to where they started. So too much process is a bad thing. Often a very bad thing.
Since the intent of having a process is to apply organization to an effort, a badly thought-out process defeats this goal. At the extreme, a random process is just formalized disorganization. Most bad processes are not truly random, but they can be overlapping, contradictory, or even have huge gaps in what they cover. These problems all reduce a process's effectiveness. Enough of them can drive the results closer to being random.
Since a process is keyed to a particular set of activities or inquiries, it needs to take the underlying reality into account. To do this it should be drafted from a 'bottom-up' perspective. Top-down process rules are highly unlikely to be effective, primarily because they are drafted from an over-simplification of the details. This causes a mismatch between the rules and the work, enhancing the disorganization rather than fixing it.
Often bad process survives, even thrives, because its originators incorrectly claim success. A defective software development process, for instance, may appear to be reducing the overall number of bugs reaching the users, but the driving cause of the decrease might just be the throttling of the development effort. Less work gets done, thus there are fewer bugs created, but there is also a greater chance for upper management to claim a false victory.
It's very easy to add complexity to an existing process. It can be impossible to remove it later. As such, an overly complex process is unlikely to improve. It just gets stuck in place, becoming an incentive for any good employees to leave, and then continues to stagnate over time. This can go on for decades. Thus arguing for the suitability of a process based on the fact that it's been around for a long time is invalid. All it shows is that it is somewhat better than random, not that it is good or particularly useful in any way.
Bad process leaves a lot of evidence lying around that it is bad. Often the amount of work getting accomplished is pitifully low, while the amount of useless make-work is huge. Sometimes the people stuck in the process are forced to bend the truth just to get anything done. They get caught between getting fired for getting nothing done and lying to get past the artificial obstacles. The division between the real work and the phantom variant required by the process manifests as a negative, conflict-based culture.
For software, picking a good process is crucial. Unfortunately the currently available choices out there in the industry are all seriously lacking in their design. From experience, the great processes have all been carefully homegrown and driven directly by the people most affected by them. The key has been promoting a good engineering culture that has essentially self-organized. This type of evolution has been orders of magnitude more successful than going out and hiring a bunch of management consultants who slap on a pre-canned methodology and start tweaking it.
That being said, there have also been some horrific homegrown processes constructed that revel in stupid make-work and creatively kill off the ability to get anything done. Pretty much any process created by someone unqualified to do so is going to work badly. It takes a massive amount of direct experience with doing something over and over again before one can correctly take a step back and abstract out the qualities that make it successful. And abstraction itself is a difficult and rare skill, so just putting in the 10,000+ hours doesn't mean someone is qualified to organize the effort.
Picking a bad process and sticking to it is nearly the same as having no process. They converge on the same level of ineffectiveness.

Monday, May 20, 2013

Death by Code

A mistake I've commonly seen in software development is for many programmers to believe that a project would improve if only they had more code.
It's natural, I guess, as we initially start by learning how to write loops and functions. From there we move on to being able to structure larger pieces like objects. This gradual broadening of our perspective continues as we take on modules, architectures and eventually whole products. The scope of our understanding keeps growing, but so far it's all been contained within a technical perspective. So, why not see the code as the most important aspect?
But not all code is the same. Not all code is useful. Just because it works on a 'good' day doesn't mean that it should be used. Code can be fragile and unstable, requiring significant intervention by humans on a regular basis. Good code not only does the right thing when all is in order, but also anticipates the infrequent problems and handles them gracefully. The design of the error handling is as critical as (if not more critical than) the primary algorithms themselves. A well-built system should require almost no intervention.
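As a small, hedged example of what anticipating the infrequent problems can look like in practice (the function and its retry limits are invented for illustration, not taken from any particular system): the happy path is a single line, and the rest exists to keep humans out of the loop when the expected-but-rare failures show up:

    import time
    import urllib.error
    import urllib.request

    def fetch_with_retries(url, attempts=3, backoff=2.0):
        # Handle the expected-but-infrequent failures (timeouts, transient
        # network errors) without waking anyone up; only give up after a
        # clear, diagnosable error.
        for attempt in range(1, attempts + 1):
            try:
                with urllib.request.urlopen(url, timeout=5) as response:
                    return response.read()
            except (urllib.error.URLError, TimeoutError) as err:
                if attempt == attempts:
                    raise RuntimeError(f"giving up on {url} after "
                                       f"{attempts} attempts: {err}") from err
                time.sleep(backoff * attempt)   # back off, then try again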
Some code is written to purposely rely on humans. Sometimes it is necessary -- computers aren't intelligent -- but often it is either ignorance, laziness or a sly form of job security. A crude form of some well-known algorithms or best practices can take far less time to develop, but it's not something you want to rely on. After decades we have a great deal of knowledge about how to do things properly, utilizing this experience is necessary to build in reliability.
Some problems are just too complex to be built correctly. Mapping the real world back to a rigid set of formal mechanics is bound to involve many unpleasant trade-offs. Solving these types of problems is definitely state-of-the-art, but there are fewer of them out there than most programmers realize. Too often coders assume that their lack of knowledge equates to exploring new challenges, but that's actually quite rare these days. Most of what is being written right now has been written multiple times in the past, in a wide variety of different technologies; it's actually very hard to find real untouched ground. Building on past knowledge hugely improves the quality of the system and takes far less time, since the simple mistakes are quickly avoided.
So not all code is good code. Just because someone spent the time to write it doesn't mean that it should be deployed. What a complex system needs isn't more code, but usually less code that is actually industrial strength. Readable code that is well thought out and written with a strong understanding of how it will interact with the world around it. Code that runs fast, but is also defensive enough to make problems easier to diagnose. Code that fits nicely together into a coherent system, with some consistent form of organization. Projects can always use more industrial strength code -- few have enough -- but that code is rare and takes time to develop properly. Anything else is just more "code".

Sunday, April 21, 2013

Monitoring

The primary usage of software is collecting data. As it is collected, it gets used to automate activities directly for the users. A secondary effect of this collection is the ability to monitor how those activities are progressing. That is, if you've built a central system for document creation and dissemination, you also get the ability to find out who's creating these documents and, more importantly, how much time they are spending on this effort.
Monitoring the effectiveness of some ongoing work allows it to be analyzed and improved, but it is a nasty double-edged sword. The same information can be used incorrectly to pressure the users into artificial speedups, forcing them to do unpaid work or to degrade the quality of their effort. In this modern age it isn't unusual for some overly ambitious upper management to demand outrageous numbers like 150% effort from their staff. In the hands of someone dangerous, monitoring information is a strong tool for abuse. They do this to get significant short-term gains, but these come at the expense of inflicting long-term damage. They don't care; they're usually savvy enough to move on to their next gig long before that debt actually becomes a crisis.
Because of its dual nature, monitoring the flow of work through a big system is both useful and difficult. It is done well when the information gets collected but its availability is limited. Software that rats out its users is not appreciated, but feedback that helps improve working effectiveness is. One way of achieving this latter goal is to collect fine-grained information about all of the activities, but only make it available as generalized, anonymous statistics. That is, you might know the minimum and maximum times people spend on particular activities, but all management can see is the average and perhaps the standard deviation. No interface exists for them to pull up the info on a specific user, so they can't pressure or punish them.
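A minimal sketch of that arrangement, with hypothetical names and numbers: the per-person timings stay private to the collection layer, and the only thing that ever leaves it is the aggregate view:

    from statistics import mean, stdev

    # Fine-grained events stay inside the collection layer.
    _activity_minutes = {
        "alice": [42, 37, 51],
        "bob":   [63, 58],
        "carol": [45, 40, 49, 44],
    }

    def team_summary():
        # Only anonymous, aggregated numbers are exposed; there is
        # deliberately no call that returns a specific user's timings.
        all_times = [t for times in _activity_minutes.values() for t in times]
        return {
            "activities": len(all_times),
            "average_minutes": round(mean(all_times), 1),
            "std_dev_minutes": round(stdev(all_times), 1),
        }

    print(team_summary())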
Interestingly enough, when collecting requirements for systems, fine-grained monitoring often shows up. Not only that, but there is usually some ‘nice sounding’ justification for having it. Most software development these days is oriented toward giving the ‘stakeholders’ exactly what they want, or even just what they ask for, but this is one of those areas where professional software developers shouldn't bow directly to the pressure. It takes some contemplation, but a good developer should always empathize with their users -- all of them -- and not build anything that they wouldn't like applied to themselves. After all, would you really be happy at work if you had to do something demeaning like punch a timecard in and out? If you don't like it, why would anyone else?