Wednesday, May 18, 2022

Twisted

The problem most programmers have with error handling is that they tend to think of it as a secondary aspect of their code. They write the code, then they add in the error handling later.

But what if you flip it?

That is, you write the code to collect errors first, and the ‘error condition’ is that the code accidentally worked.

For example, say you are writing a GUI form for users. You know that there will be lots of things wrong with the data in the form. So, you set up the form such that each widget has a space for an error message. That is, it is actually two widgets each time, but sometimes the error text is invisible. Then all you need to do is call some overall form validation that gives you the list of these messages, which you conditionally stick into the form. Some widgets won’t have errors, but you write the code to assume that most do.

Then, inside of the underlying form validation, you iterate through each widget’s input and feed those values to individual validation routines based on some pre-set form field criteria. If a routine hands back an error message, you add it to the list. If not, that is okay too.
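A minimal sketch of that flipped validation, in Python. The field names and validator rules here are hypothetical examples, not part of any real framework:

```python
# Errors-first form validation: assume most widgets will have messages.
def validate_form(fields, rules):
    """Run every field through its validators and collect all messages.

    fields: dict of field name -> raw input value
    rules:  dict of field name -> list of validator functions; each
            validator returns an error message string, or None if OK.
    """
    messages = {}
    for name, value in fields.items():
        for check in rules.get(name, []):
            error = check(value)
            if error is not None:
                # The expected path: record the message for this widget.
                messages.setdefault(name, []).append(error)
    return messages  # an empty dict is the 'error condition': it all worked

# Hypothetical validators, for illustration only.
def required(value):
    return "This field is required." if not value.strip() else None

def looks_like_email(value):
    return None if "@" in value else "Please enter a valid email address."

form = {"name": "", "email": "nobody.example.com"}
rules = {"name": [required], "email": [required, looks_like_email]}

errors = validate_form(form, rules)
# Each widget can now conditionally show its own messages.
```

Note that nothing here branches on success; the happy path is just the degenerate case where the collected list is empty.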

Encoded that way, the logic for handling complex multi-errors in a form is quite simple.

The same is true for any distributed code. Let's say you have to call a REST API. If you start by assuming that it will fail, then you obviously want to get back a full and detailed set of error information. It can fail for all sorts of reasons, and it helps later to know which specific one caused the fault. That becomes the default behavior.

But for a few types of the possible errors, it sometimes makes sense to just wait a few seconds and try again to see if it will work now. You can bury that just below the surface. So now you have a call that will give you back a detailed error, but occasionally it will wait and retry if it thinks the error might be temporary. If the retries fail, it returns a detailed error for each attempt.

Then all you need is to encode the ‘error response’, which is that the call worked and gave you back some clump of data. So, in an odd sense, the error is something like OK or Success. After that, you take that clump, ‘process’ it somehow, and then pass it back up the stack.
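The whole shape of that call can be sketched as below. The transport function, error categories, and tuple shapes are all hypothetical stand-ins for a real HTTP client:

```python
import time

TRANSIENT = {"timeout", "unavailable"}  # assumed retryable categories

def call_with_retry(do_call, attempts=3, delay=0.0):
    """Assume failure: collect a detailed error record for every attempt.

    do_call() returns either ("ok", data) or ("error", category, detail).
    On success the result is ("ok", data); otherwise ("error", failures),
    where failures is the full list of per-attempt error records.
    """
    failures = []
    for attempt in range(1, attempts + 1):
        result = do_call()
        if result[0] == "ok":
            return result  # the 'error condition': it actually worked
        _, category, detail = result
        failures.append({"attempt": attempt,
                         "category": category,
                         "detail": detail})
        if category not in TRANSIENT:
            break  # permanent fault, no point retrying
        time.sleep(delay)
    return ("error", failures)

# Hypothetical transport that fails twice, then 'accidentally' works.
responses = iter([("error", "timeout", "read timed out"),
                  ("error", "timeout", "read timed out"),
                  ("ok", {"status": "created"})])
status, payload = call_with_retry(lambda: next(responses))
```

The caller never sees a bare exception; it always gets either the clump of data or the complete history of what went wrong.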

While this works extremely well for error handling, there are lots of different cases in coding where if you flip your perspective, the code gets simplified. Sometimes even adding fake or ‘virtual’ things works as well, and will really help to clean up the logic. Besides cleaner code, you’ll also find fewer bugs and get the added bonus of being able to visually verify that the code will do what you expect, so you are not as reliant on strong testing.

Saturday, May 7, 2022

Encapsulation vs Fragmentation

A technology is encapsulated if all of the things you need to do in order to customize it are contained internally. On the other hand, it is fragmented if many of the things you need to do to customize it are external. Customizations can include adding or changing config, data, or code.

That is, are all the changes in one place, or are they scattered around everywhere?

Examples:

Unix shell programming is encapsulated. In a shell language like bash, for example, any command you are going to execute is available in the account you’ve logged into. If it’s not there, then you can’t use it. Most of the commands are in /bin, /usr/bin, or /usr/local/bin, so there is a small, finite set of locations to check to see if you can use a command or not.

You can read a script to figure out what it is doing. Most things are explicit, or at least explicit enough if you understand the design philosophy, that you don’t need to jump around. Small, fairly simple commands, that are well documented with man pages. A script can include other scripts, but it is straightforward to find and investigate them.

It’s still somewhat encapsulated if you have an arrangement like an ‘app store’, where the possibilities are external, but having picked an app, it is ‘copied’ locally. It can be subverted, though, by an app refusing to run unless it is upgraded; that external dependency is not local.

A browser is intrinsically fragmented. The websites are scattered all over the planet, and they can change dynamically. While it is super convenient, it is not reliable, and the state of the website is disconnected from the state of your computer.

A code repo that only references its necessary dependencies is fragmented. If it is loose with version numbers, then any given instance of it can even produce different binaries or runtime behavior.
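As a hypothetical illustration, in a Python requirements file the difference is a single specifier (the library name here is made up):

```
somelib>=2.0    # loose: resolves to whatever release exists at install time
somelib==2.3.1  # pinned: every instance of the repo builds the same thing
```

With the loose form, two checkouts of the same repo installed a month apart can behave differently; the pinned form keeps the behavior inside the repo.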

A lot of the conventions in Javascript are also highly fragmented. It seems to be a core philosophy of the stack, and part of why programmers either really love it or completely hate it. Underneath, it really is a tradeoff between development speed and readability. Programmers can throw together large web apps, but it’s difficult to know if they’ll handle all of the user interactions correctly. So it's common to stumble into undefined behaviors in these apps, which makes proper testing and security a lot more expensive.

Declarative programming can be fragmented. You might have to keep bouncing between configuration files and function calls, in order to figure out how it works. Likewise, dependency injections can be fragmented as well, although if they are set up correctly as dynamic polymorphism they might be ok.

One of the key problems with fragmentation is that as the code gets more sophisticated it quickly becomes impossible to assert the underlying behavior. It is a rather implicit form of spaghetti. It’s scrambled, but indirectly. Instead of bouncing all over the place in the call tree, you bounce all over the place in the codebase. But you are still bouncing all over the place. As such it does not scale well.

For encapsulated work, you can examine the parts and assess whether it will or will not behave in a certain way. Those ‘clear’ and ‘understandable’ qualities are a core part of readability. You should be able to look at any piece of code and at least understand what it is trying to do.

Some confusion may come from layering. A complex system can be built up from many smaller, encapsulated parts. Those parts themselves may have been built up from underlying parts. If the encapsulations are clear and obvious, you don’t need to go into them in order to understand their behavior. You just need to make sure that the layer does what it claims.

People seem to initially like fragmentation, probably because it places very few restrictions on where you can toss stuff. But that freedom, particularly when used erratically, is exactly why it is a problem. A group of developers might desire the freedom to initially set up their work in a specific way, but don’t confuse that with individual programmers later having the freedom to do things in whatever way they feel like today. Once the project has been set up, that original freedom is gone. Everyone else working on the project needs to follow suit. Programming takes discipline, constantly making a mess of stuff is a really bad habit to get into.

In theory, you can document the fragments, but documentation is always the first casualty of time, so it is rarely done. And often when it is, it is badly done. The big problem is that people confuse fragmentation with small encapsulated pieces; they are not the same thing. Location is part of it, but so are consistency, philosophy, and context. Unix shells, for example, aren’t fragmented, but they are small reusable pieces. Encapsulation removes the need to dig around inside. Fragmentation does not.