A property of spaghetti code is that it is highly fragmented and distributed all over the place. So, when you have to go in and see what it is doing, you end up getting lost in a ‘maze of twisty passages’.
Code gets that ugly in two different ways. The first is that the original base code was cloudy. That is, the programmer couldn’t get a clear view of how it should work, so they encoded the parts oddly and then mashed them all together and got them to work, somehow. The second is that the original code might have been nice, tight, and organized, but a long series of successive changes over time, each one just adding a little bit, gradually turned it into spaghetti. Each change seemed innocent enough, but the collective variations from different people, different perspectives, added up into a mess.
The problem with spaghetti is that it is too cognitively demanding to fix correctly. That is, because it is spaghetti, there is a higher-than-normal likelihood that it contains a lot of bugs. In a system that hasn’t been well used, many of these bugs exist but haven’t been noticed yet.
When a bug does rear its head, someone needs to go into this code and correct it. If the code is clear and readable, they are more likely to make it better. If the code is a mess, and difficult, they are more likely to make it worse, hiding new problems behind their bugfix. That is, buggy code begets more bugs, which starts a cycle.
So, obviously, we don’t want that.
To see how to avoid this, we have to start at the end and work backward. That is, we have to imagine being the person who now has to fix a deficiency in the code.
What we really want is for them to not make the code worse, so what we need to do is put all of the similar aspects of what the code does in as close proximity to each other as possible. That is, when it is all localized together, it is way easier to read, and that means the bug fixer is more likely to do the right thing.
Most computations, in big systems, naturally happen in layers. That is, there is a high-level objective that the code is trying to achieve to satisfy all or part of a feature. That breaks down into a number of different steps, each of which is often pretty self-contained. Below that, the lower steps generally manipulate data, which can involve getting it from globals, a cache, persistence, the file system, etc. In this simple description, there are at least three different levels, each with a very different focus.
When we are fetching data, for example, we need some way to find it, then we need to apply some transformations to it to make it usable. That, in itself, is self-contained. We can figure out everything that is needed to get ‘usable’ data from the infrastructure without being concerned with ‘why’ we need the data. There is a very clear ‘line’ between the two.
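To make those layers concrete, here is a minimal sketch in Python. Every name in it, the accounts file, the fields, the ‘usable’ shape of the data, is invented purely for illustration; the point is only that finding and transforming the data sits entirely below the line, and the layers above never care how it got there.

```python
import json
from pathlib import Path

# --- Lower layer: find the raw data and transform it into something usable. ---
# Everything about 'where' and 'how' lives here; nothing about 'why'.
def fetch_usable_accounts(path: Path) -> list[dict]:
    raw = json.loads(path.read_text())                   # find it (file, cache, db, ...)
    return [
        {"id": r["id"], "balance": float(r["balance"])}  # transform it into a usable shape
        for r in raw
        if r.get("active")
    ]

# --- Middle layer: one self-contained step of the larger objective. ---
def total_active_balance(path: Path) -> float:
    accounts = fetch_usable_accounts(path)  # clear line: data is usable past this point
    return sum(a["balance"] for a in accounts)

# --- Upper layer: the high-level objective behind a feature. ---
def report_balances(path: Path) -> str:
    return f"Active balance: {total_active_balance(path):.2f}"

if __name__ == "__main__":
    print(report_balances(Path("accounts.json")))  # hypothetical input file
```

If report_balances prints nonsense, the first question is simply whether fetch_usable_accounts handed back usable data; the answer tells you which side of the line to look at.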
So, what we want from our layers is clear lines. Very clear lines. An issue is either 100% on one side or the other, but not blurred.
We do this because it translates back into strong debugging qualities. That is, in one of the upper layers, all one needs to do is check to see if the data is usable, or not. If it is, the problem is with the higher-level logic. If not, then it is how we have fetched the data that is the real problem. Not only have we triaged the problem, but we’ve also narrowed down where the fix may need to go.
There have often been discussions amongst programmers about how layers convolute the code. That is true if the layers are arbitrary: just a whole bunch of them that don’t have a reason to exist. But it is also true that at least some of them are necessary. They provide the organizational lines that help localize similar code and encapsulate it from the rest of the system. They are absolutely necessary to avoid spaghetti.
Getting back to the basic three-level structure, it is important that when there is a problem, a programmer can quickly identify that it is located somewhere under that first, higher level. So, we like to create some type of ‘attachment’ between what the user has done in the application and the series of steps that will be executed. That is fundamentally an architectural decision, but again it should be clear and obvious.
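One simple way to make that ‘attachment’ visible is a dispatch table that maps each user action onto the high-level handler whose steps will run. This is only a sketch, the action names and handlers are made up, and a real system might wire this up through routes, commands, or events instead; the shape of the mapping is what keeps it obvious.

```python
from typing import Callable

# Hypothetical high-level handlers, one per user-visible action.
def export_report() -> None:
    print("collect the base information")
    print("dump the data")
    print("close and clean up")

def refresh_dashboard() -> None:
    print("fetch usable data")
    print("recompute the summaries")

# The 'attachment': given what the user did, it is obvious which
# series of steps will run, so a bug report maps straight to code.
ACTIONS: dict[str, Callable[[], None]] = {
    "export-report": export_report,
    "refresh-dashboard": refresh_dashboard,
}

def handle(action: str) -> None:
    ACTIONS[action]()  # a KeyError here means the action was never wired up

if __name__ == "__main__":
    handle("export-report")
```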
That means that in the best-case scenario, an incoming programmer can quickly read the different steps in the high level, compare them back to the bug report, and narrow down the problem. When the code is really good, this can be done quickly by just reading it, pondering the effect, and then understanding why the code and the intent deviated. If the code is a bit messier, the programmer might have to step through it a few times to isolate the problem and correct it. The latter is obviously more time-intensive, so making the code super readable is a big win at this point.
What happens in a massive system is that one of the steps in a higher-level objective drops down into some technical detail that is itself a higher-level objective for some other functionality. So, the user initiates a data dump of some type, the base information is collected, then the dump begins. Dumping a specific set of data involves some initial mechanics, then the next step is doing the bulk work, followed by whatever closing and cleanup is needed. This lower functionality has its own structure. So there is a top three-layer structure, and one of the sub-parts of a step is another embedded three-layer structure. Some fear that this might be confusing.
It’s a valid concern, but it really only matters if the lower structure does something that can corrupt the upper one. This is exactly why we discourage globals, gotos, and other sources of side effects. If the sub-part is localized correctly, then again, either it works or it doesn’t. It doesn’t really matter how many levels there are.
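Here is a hedged sketch of that data-dump example, again with invented names. The lower structure, with its own mechanics, bulk work, and cleanup, is fully localized inside one call, so from the upper structure’s point of view it either hands back a correct file or it doesn’t.

```python
from pathlib import Path

# --- Lower structure: dumping one set of data has its own three steps. ---
def dump_records(records: list[dict], path: Path) -> Path:
    lines = [f"{r['id']},{r['value']}" for r in records]  # initial mechanics
    path.write_text("\n".join(lines) + "\n")               # the bulk work
    return path                                            # closing/cleanup, hand back the result

# --- Upper structure: the user-initiated objective, also three steps. ---
def run_data_dump(records: list[dict], out_dir: Path) -> Path:
    out_dir.mkdir(parents=True, exist_ok=True)            # collect/prepare the base information
    result = dump_records(records, out_dir / "dump.csv")   # one step is itself a layered routine
    print(f"dump finished: {result}")                      # wrap up
    return result

if __name__ == "__main__":
    run_data_dump([{"id": 1, "value": "a"}, {"id": 2, "value": "b"}], Path("out"))
```

Because dump_records touches nothing outside of its own arguments, the upper layer can treat it as a black box while debugging.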
In debugging, if the outputs coming back from a function call are sane, you don’t need to descend into the code to figure out how it works; you can ignore that embedded structure. And if that structure is reused all over the system, then you get a pretty good feeling that it is better debugged than the higher-level code. You can often utilize those assumptions to speed up the diagnosis.
So we see that good localization doesn’t just help with debugging and readability, it also plays back into reuse. We don’t have to explicitly explore every underlying encapsulation if their only real effect is to produce ‘correct’ output. We don’t need to know how they work, just that the results are as expected. And initially, we can assume the output is correct unless there is evidence in the bug report to the contrary. That, then, is a property we can rely on, if the code is good, to narrow the scope down to a much smaller set of code. Divide and conquer.
The converse is also true. If the code is not localized, then we can’t make those types of assumptions, and we pretty much have to visit most of the code that could have been triggered. So, it’s a much longer and more onerous job to try and narrow down the problem.
A bigger codebase always takes longer to debug. Given that we are human, and that tests can only catch a percentage of the bugs we were expecting, then unless one does a full proof of correctness directly on the final code, there will always be bugs, in every system. And as it grows larger, the bugs will get worse and take longer to find. Because of that, putting in the effort to reduce that problem pays off heavily. There are many other ways to help, but localization, as an overall property of the code, is one of the most effective ones. If we avoid spaghetti, then our lives get easier.