The Programmer's Paradox: Natural Decompositions

Wednesday, February 7, 2024

Natural Decompositions

Given a large problem, we start by breaking it down into smaller, more manageable pieces. We can then solve all of the smaller problems and combine them back together to solve the original problem.

The hiccup is that not all decompositions are created equal. If you break a big problem down into subparts and those subparts have any sort of cross-dependencies, you can’t work on them independently. The dependencies invalidate the decomposition.

So we call any decomposition where all of the subparts are fully independent a ‘natural’ decomposition. It is a natural, complete, hard ‘line’ that completely separates the different parts.

Do natural decompositions actually exist?

Any subpart that has no dependencies on other outside parts is fully encapsulated. It is a black box.

A black box can have an interface. You can put things into the box. It’s just that whatever happens in the box stays in the box. You don’t need to know anything about how the box works inside, just on the outside.

A car engine is a good example. You put in fuel, and you play with the pedals, then the car moves. If you are just driving around, you don’t need to know much more than that. Maybe if you are pushing it on the highway or a racetrack, you’d need to understand gearing, acceleration, or torque better, but to go to the grocery store with an automatic transmission it isn’t necessary.

Cars have fairly good natural decompositions. They are complex machines, but most people don’t really need to understand how they work. Mechanics and race car drivers do.
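In software terms, the same black box might look something like the sketch below. It is only an illustration: the `Car` class, its method names, and the fuel and speed numbers are all made up for the example. The point is that the driver only ever touches the two public methods.

```python
class Car:
    """Hypothetical black box: fuel and pedals go in, movement comes out."""

    def __init__(self) -> None:
        # Internal state. Nothing outside the box should read or write these.
        self._fuel_litres = 0.0
        self._speed_kmh = 0.0

    def add_fuel(self, litres: float) -> None:
        """Part of the interface: put fuel into the box."""
        self._fuel_litres += litres

    def press_accelerator(self, amount: float) -> float:
        """Part of the interface: amount is 0.0 to 1.0; returns the new speed."""
        if self._fuel_litres <= 0:
            return self._speed_kmh  # no fuel, no change
        self._burn(amount)
        self._speed_kmh += 20.0 * amount  # invented number, for illustration
        return self._speed_kmh

    def _burn(self, amount: float) -> None:
        # How fuel is consumed stays in the box; a driver never calls this.
        self._fuel_litres = max(0.0, self._fuel_litres - 0.1 * amount)


car = Car()
car.add_fuel(10.0)
speed = car.press_accelerator(0.5)
```

A mechanic might care about `_burn`; someone driving to the grocery store does not.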

Software, though, is much harder to decompose because it isn’t visible. The lines between things can be messed up and awful, but very few people would know this. A five-wheeled car/truck/motorbike monstrosity would be quickly discounted in reality, but would likely survive as a software component.

Although we don’t see it the same way, we can detect when a decomposition is bad. The most obvious test is this: if you have to add a line of code, how many places are there where it would reasonably fit? The answer should be one. If it is not, then the lines are blurred somewhere.

And that is the crux. A good decomposition eliminates the degrees of freedom. There is just one place for everything. Then your code is organized if everything is in its one place. It’s simple, yet not simple at all.

For example, if you break off part of the system as a printing subsystem, then any and all code that is specifically tied to printing must be in that subsystem.

Now, that’s not to say that there isn’t an interface to the printing subsystem. There is. Handling user context and the specific GUI contexts is done elsewhere and must be passed in. But no heavy lifting is ever done outside. Only on the inside. You might have to pass in a print-it-this-way context that directs what is done, but it only directs it from the outside; the ‘doing it’ part is inside the box.
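Here is a minimal sketch of what that split could look like. Everything in it is hypothetical: `PrintContext`, `PrintingSubsystem`, and `handle_print_button` are invented names, and the "rendering" is just string formatting standing in for real printing work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintContext:
    """Hypothetical print-it-this-way context, built outside the box."""
    copies: int = 1
    landscape: bool = False
    header: str = ""


class PrintingSubsystem:
    """Hypothetical black box: all printing-specific work lives in here."""

    def __init__(self) -> None:
        self._spool: list[str] = []  # internal queue, never touched from outside

    def print_document(self, text: str, context: PrintContext) -> int:
        """Render the text and queue it; return the number of copies queued."""
        page = self._render(text, context)
        for _ in range(context.copies):
            self._spool.append(page)
        return context.copies

    def _render(self, text: str, context: PrintContext) -> str:
        # Layout is a printing concern, so the decision is carried out here.
        orientation = "landscape" if context.landscape else "portrait"
        lines = [f"[{orientation}]"]
        if context.header:
            lines.append(context.header)
        lines.append(text)
        return "\n".join(lines)


def handle_print_button(subsystem: PrintingSubsystem, user_name: str, text: str) -> int:
    """GUI-side code: it gathers context and passes it in, nothing more."""
    context = PrintContext(copies=2, header=f"Printed for {user_name}")
    return subsystem.print_document(text, context)
```

The GUI handler knows who the user is and how many copies they asked for, but it never formats a page. If a new line of layout code is needed, there is only one reasonable place for it: inside `_render`.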

One of the hardest problems in software is getting a group of programmers to agree on defining one place for all of the different types of code and actually putting that code in the one place it belongs.

It fails for two reasons. The first is that it is a huge reduction in freedom. You aren’t free anymore to put the code anywhere. The culture of programming celebrates freedom, even when it makes our lives way harder or even tragic.

The other reason is the difficulty of making it quick and easy for newer programmers to know where to put stuff. If we fully documented all of those places, it would be far too much to read, and if we don’t, most people won’t read the code to try to figure it out for themselves. Various standards and code reviews have tried to address it over the decades, but more often than not people just create a mess and pretend like they didn’t. Occasionally you see large projects with good discipline; it happens.

This shows up in other places too. Architecture is the drawing of lines between things. An enterprise architect should draw enough lines in a company to keep it organized; a system architect should draw enough lines in a system for the same effect. Again, these lines need to be natural to be useful. If they are arbitrary, they make the problems worse, not better.

Decomposition is the workhorse of software development, but it’s far too easy to get it wrong. Fortunately, it’s not hard to figure out if it’s wrong and fix it. Things go a lot smoother when the decompositions are natural and the work is organized. Programming is hard enough sometimes; we don’t need to find ways to make it worse.
