Monday, November 5, 2007

The Art of Encapsulation

In all software projects we quickly find ourselves awash in an endless sea of annoying little details.

Complexity run amok is the leading cause of project drownings; controlling it is vital. If you become swamped, it usually gets worse before you can manage to get it under control. It starts as a downwards spiral; picking up speed as it grows. If it is not explicitly brought under control, it swallows the entire project.

The strongest concept we have to control complexity is "encapsulation". The strength behind this idea is the ability to isolate chunks of complexity and essentially make them disappear. Well, to be fair they are not really gone, just buried in a black box somewhere; safely encapsulated away from everything else. In this way we can scale down the problem from being a monstrous wave about to capsize us, into a manageable series of smaller waves that can be easily surmounted.

Most of us have been up against our own personal threshold for keeping track of the project details, beyond which things become to complex to understand. We known that increasing the size of the team can push up the ability to handle bigger problems, but it also increases the amount of overhead. Big teams require bigger management, with each new member becoming increasingly less effective. Without a reasonable means to partition the overall complexity, the management cross-talk for any significant sized project would consume the majority of the resources. Progress quickly sinks below the waves. Just adding more bodies is not the answer, we need something stronger.

Something that is really encapsulated, hides the details from the outside world. By definition, it has become simpler. All of the little details are on the inside, and none of them need to be known on the outside to be effective. In this way, that part of the problem has been solved and is really to go. You are free to concentrate on the other unresolved issues, hopefully removing them one-by-one until the project comes down to just a set of controlled work that can be completed. At each stage, the complexity needs to be found, dealt with and contained.

The complexities of design, implementation and operation are very different from each other. I've seen a few large projects that have managed to contain the implementation complexity, only to lose it all by allowing the operational complexity to get out of hand. Each part in the process requires its own understanding to control its own special complexity. Each part has its own way of encapsulating the details.

Successful projects get there because at every step, the effort and details are well understood. It is not that they are simpler, just that they are compartmentalized enough that changes do not cause unexpected complexity growth. If you are, for example adding some management capabilities to an existing system, the parts should be encapsulated enough that the new code does not cause a significant number of problems with the original code, but it should also be tied closely enough to it too leverage its capabilities and its overall consistency. Extending the system should be minimal work that builds on the existing code base. This is actually easy to do if the components and layers in the system are properly encapsulated; it is a guaranteed side effect of a good architecture. It is also less work and higher quality.

Given that, it is distressing how often you see architectures, libraries, frameworks and SDKs that should encapsulate the details, but they do not. There is some cultural aspect to Computer Science where we feel we cannot take away the choices of other programmers. As such, people often build development tools to encapsulate some programming aspects, but then they leave lots of the details free to percolate upwards; violating the whole encapsulation notion. Badly done encapsulation is not encapsulation. If you can see the details, then you haven't hidden them, have you?

The best underlying libraries and tools are ones that provide a simplified abstraction over some complicated domain, hiding all of the underlying details as they go. If we wanted to get down and dirty we wouldn't use the library, we would do it directly. If we choose to use the library, we don't want to have to understand it and the underlying details too. That's twice as much work, when in the end, we don't care.

If you are going to encapsulate something, then you really should encapsulate it. For every stupid little detail that you let escape, the programmers above you should be complaining heavily. If you hide some of the details, but the people above still need to understand them in order to call things properly, then it wasn't done correctly. We cannot build on an infrastructure that is unstable because we haven't learned how it works. If we have to understand it to use it, it is always faster in the long run just to do it ourselves. So ultimately building on a bad foundation is just wasting our time.

It is possible to properly encapsulate the details. It is something we need to do if we want to build better software. There is no need to watch your projects slip below the churning seas. It is avoidable.