Friday, March 7, 2025

Concurrency

For large, complex code, particularly when it is highly configurable, polymorphic, or built on dependency injection, it is sometimes difficult to get a clear and correct mental model of how it actually runs. The code might bounce around in unexpected ways. Complicated flow control is the source of a lot of bugs.

Now take that, but interlace multiple instances of it, each interacting with the same underlying variables, and know that it can get far worse.

The problem isn’t the instructions; it is any and all of the variables that they touch.

If there is a simple piece of code that just sets two variables, it is possible that one instance sets the first one, but then is leapfrogged by another instance setting both. After the first instance finally sets the second variable, the state is corrupt. The values of the variables are a mix between the two concurrent processes. If the intent of the instructions was to make sure both are consistent, it didn’t work.
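That leapfrogging can be shown deterministically. The sketch below uses Python generators to model the preemption point between the two writes; the variable names and values are hypothetical, chosen only to make the interleaving visible.

```python
state = {"x": 0, "y": 0}

def set_pair(value):
    """Set both variables, yielding between writes to model a preemption point."""
    state["x"] = value
    yield                 # the scheduler could switch to another thread here
    state["y"] = value

a = set_pair(1)           # "thread" A wants x = y = 1
b = set_pair(2)           # "thread" B wants x = y = 2

next(a)                   # A writes x = 1, then is preempted
for _ in b:               # B runs to completion: x = 2, y = 2
    pass
for _ in a:               # A resumes and writes y = 1
    pass

print(state)              # {'x': 2, 'y': 1} — the pair is no longer consistent
```

Real threads hit this same interleaving only occasionally, which is exactly what makes the bug hard to reproduce.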

If it were just an issue with a single function call, it could be managed by adding a critical section around the two assignments, but more often than not, a function call is the root of a tree of execution that can include all sorts of other function calls with similar issues.
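For the simple two-assignment case, the critical section does work. A minimal sketch, assuming a shared `state` dictionary and many competing threads:

```python
import threading

state = {"x": 0, "y": 0}
lock = threading.Lock()

def set_pair(value):
    # The lock makes the two writes atomic with respect to other callers:
    # no other thread can observe x from one call and y from another.
    with lock:
        state["x"] = value
        state["y"] = value

threads = [threading.Thread(target=set_pair, args=(v,)) for v in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert state["x"] == state["y"]   # consistent, whatever the interleaving was
```

The trouble starts when the body of the critical section calls out into other code, because anything it touches is now implicitly inside the lock too.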

A simple way around this is to add big global locks around stuff, but that effectively serializes the execution, thus defeating most of the benefits of concurrency. You just added a boatload of extra complexity for nothing.

Just reading variables doesn’t help if they can be changed elsewhere. A second, writing thread could be half completed during the read, which is the same problem as before.
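This is why readers need the same protection as writers. A sketch of the idea, again with hypothetical variable names: an unprotected read can observe `x` from one writer and `y` from another, so the read also takes the lock.

```python
import threading

state = {"x": 0, "y": 0}
lock = threading.Lock()

def writer(value):
    with lock:
        state["x"] = value
        state["y"] = value

def read_pair():
    # Without the lock, this read could land between a writer's two
    # assignments and return a torn, inconsistent pair.
    with lock:
        return state["x"], state["y"]

threads = [threading.Thread(target=writer, args=(v,)) for v in range(50)]
for t in threads:
    t.start()
x, y = read_pair()
for t in threads:
    t.join()

assert x == y   # a locked read never sees a half-finished write
```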

You can make it all immutable, but you still have to be very concerned with the lifetime of the data. The points where it is loaded into or deleted from memory can be corrupted too. You can only ignore thread safety when everything is preloaded first, strictly immutable, and never deleted.

Most days, concurrency is not worth the effort. You need it for multi-use code like a web server, but you also want to ensure the scope of any variable is tightly bound to the thread of execution. That is, at the start of the thread, you create everything you need, and it is never shared. Any global you need is strictly immutable and never changes or gets deleted. Then you can ignore it all. Locking is too easy to get wrong.
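That discipline can be sketched directly. In the hypothetical example below, the only shared global is an immutable tuple created before any thread starts; every thread builds its own mutable state and writes only to a slot that no other thread touches, so no locking is needed at all.

```python
import threading

# The only shared data: preloaded before any thread starts, strictly
# immutable (a tuple), and never deleted. Safe to read from anywhere.
CONFIG = ("utf-8", 30)

def handle_request(request_id, out):
    # Everything mutable is created inside this thread and never shared.
    local = {"id": request_id, "encoding": CONFIG[0], "work": request_id * 2}
    out[request_id] = local   # each thread owns exactly one slot of 'out'

out = [None] * 4
threads = [threading.Thread(target=handle_request, args=(i, out)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([entry["work"] for entry in out])   # [0, 2, 4, 6]
```

Because nothing mutable is ever visible to two threads at once, there is nothing to lock and nothing to get wrong.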

Some neat language primitives like async seem to offer simple-to-use features, but if you don’t understand their limits, the cost is Heisenbugs: strange concurrency corruptions so low in frequency that people confuse them with random hiccups. They might occur once per year, for example, making them nearly impossible to replicate, so they tend to stay around forever and agitate the users.

If you aren’t sure, then serial execution is best. At least for the start. If a language offers features to catch absolutely every possible concurrent issue, that is good too, but one that only catches ‘most’ of them is useless because ‘some’ of them are still out there.

Most concurrent optimizations are only ever micro optimizations. The performance gains do not justify the risks involved. It is always far better to focus on normalizing the data and being frugal with the execution steps, as they can often net huge macro optimizations.

The hard part about concurrency is not learning all of the primitives but rather having a correct mental model of how they all interoperate so that the code is stable.
