Thursday, December 15, 2022

Determinism

When you go to release a piece of software, you really, really, really want to know what that software will do in the wild. That is a big part of the mastery of programming.

People often ask for code that does very specific things. You write exactly that code. Each and every time it runs it will correctly do what is asked. Its behavior in ‘all’ circumstances is what you expect it to be.

There will be bugs, of course.

Depending on your skill level, there is a relatively fixed amount of targeted testing that can achieve any particular desired quality level.

For instance, if you are a strong programmer and you need near-perfect quality, depending on the size of the codebase, the testing might only require a few months or a couple of quarters of intensive effort to confirm that it always works correctly.

Aside from bugs, there are plenty of other things in code that can affect whether or not its execution will match your expectations. We’ll use the word ‘determinism ‘to describe these.

Code is deterministic if it only ever does what you expect. It is non-deterministic if its behavior is unpredictable; you don’t know how it works or what it will do when it runs.

A fully deterministic piece of code takes some inputs, applies a reasonable computation on them, and then produces some outputs. This happens each and every time you run the code. It seems simple enough.

So, the first point in making code deterministic, is that you understand and get control of all of the inputs.

We learned over fifty years ago that global variables were a bad idea. If you look at a function that relies on a global variable then not only do you have to worry about the inputs, but you also have to have some deep understanding of all of the possible states of that global. If it’s a signed integer, for instance, it could be any value between the minimum and maximum integer values.

For one calculation that doesn’t seem that bad, but as the code is following an execution tree for a bunch of functions, you can’t assume that the integer hasn’t changed for some reason. It may have been zero at the start, but somewhere else deep down in the stack it flipped to 42. You don’t know if that is possible, or if it happened just by looking at a small number of those functions. You’d have to look at everything that touches that global, and figure out if it ever got triggered. It is bad for single-threaded processes, it is far worse for multi-threaded code.

This is to say that referencing a global variable in your function adds some non-determinism to the function itself. Maybe a little, maybe a lot. Now, you aren’t entirely sure what to expect anymore when it executes. It’s not a massive problem for a single integer, but for each such occurrence, it only gets worse.

So any globals, whether they are primitive values, objects, singletons, etc. start to add non-determinism to each and every bit of code that utilizes them. They infect the code.

Oddly the fix isn’t hard.

We just make sure that every input in any function is declared as part of the function parameters. Then we know, way at the top, that all of the underlying functions are working on the same instance of that global and that they are referencing the same value, each and every time.

In that sense, “except for the top”, using globally accessible ‘variables’ is degrading the determinism of the code. Constants and immutable globals are okay, it’s just the stuff that varies that is dangerous.

We can further explore this by understanding that there is always a greater context playing out for our code. We can refer to that ‘execution context’ as ‘state’. If the only thing in that state is fixed data, then there is only 1 overall global state, and everything below that is deterministic. If there are N variables, each of which has M1, M2, ... Mq possible values, then the number of possible states is the M1*M2*...*Mq permutations, which is a crazy large number. Too many for us to imagine, too many for us to craft correct expectations about the execution.

But programmers use globals all of the time because it is easier. You just declare stuff once in some file, then keep referencing it everywhere. Oddly, shoving those globals into the function args is only slightly more work, but people still love to cut this corner aggressively.

Along with globals, a long time ago we realized that using ‘goto’ statements caused similar problems. They allow the flow of the code to just jump out to strange and unpredictable places. It’s the same problem with flow, that globals are to state. You can’t look at a function and produce correct expectations; you can not fully “determine” what that code is going to do without a lot of jumping around, searching, and contemplating other parts of the codebase. The flow of control itself is non-deterministic.

It’s not that the goto operations are all bad, after all, they did return to languages in limited circumstances like setjmp/longjmp and later in try/catch statements. Just that reckless usage cranks up the non-determinism, which makes it increasingly impossible to figure out what this stuff does in the wild.

The try/catch idiom for example does sometimes lightly re-introduce the goto problem. Any usage tends to split the outbound flow from the code. If there is a stack of code, each of which is doing one or more try/catch handling, then each one adds a *2 possible branch to the way the code will execute. A couple of *2 branches are easy to understand, but throwing lots of them everywhere all over the execution stack will explode the permutations beyond anyone’s ability to imagine the behavior, and thus make everything non-deterministic again.

That’s why the best practice is to always restrict try/catch handling to be only right at the bottom or right at the top. Put it in the bottom if you can do something meaningful right there, otherwise, let it go right up to the top and get dealt with just once. If you use try/catch diligently it cuts down on your own flow mechanics, but if you don’t apply any discipline the whole thing becomes increasingly non-deterministic.

As well as globals and gotos, non-determinism can be introduced at the computational and architectural levels too. The problem space itself might hold some non-deterministic or ambiguous elements, which can’t be tackled by brute force. Examples of these types of problems are quite complex, but in general, we tend to capture them as issues like transactional integrity, the dirty write problem, CAP, consensus, etc. They occur because the code cannot ever statically make the right choice, so, you need to wrap the theoretical aspects in some deep complexity to either make it deterministic or at least to make it as nearly deterministic as possible.

Deliberately ensuring determinism in your code is part of mastering coding. Making it readable, future-proof, and accurate work estimations are also necessary. Throwing together some code that kinda works is easy. Throwing together a lot of code that always works is difficult.

No comments:

Post a Comment

Thanks for the Feedback!