We can take a set of ‘inputs’ and grind through that information to produce a set of ‘outputs’.
If the grinding is 100% repeatable, so that barring a physical problem it always returns exactly the same outputs for the same inputs, it is deterministic. Otherwise, it is probabilistic: it may work, but sometimes it will not.
If the output is exactly correct, it is an algorithm. If it is a close approximation to what was needed, it is a heuristic. It tries its best.
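To make the distinction concrete, here is a minimal sketch in Python (the function names and data are just for illustration): an exact mean is a deterministic algorithm, while a sampled estimate is a probabilistic heuristic that trades exactness for speed.

```python
import random

def exact_mean(values):
    # Deterministic algorithm: the same inputs always give exactly the same output.
    return sum(values) / len(values)

def sampled_mean(values, sample_size=100):
    # Probabilistic heuristic: it estimates the answer from a random sample,
    # so repeated runs can return slightly different results.
    sample = random.sample(values, min(sample_size, len(values)))
    return sum(sample) / len(sample)

data = list(range(1, 1_000_001))
print(exact_mean(data))    # always 500000.5
print(sampled_mean(data))  # close to 500000.5, but varies from run to run
```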
If there are ambiguities in the data that need to be worked around, it is non-deterministic. Sometimes there are trade-offs in doing this, but only if expressibility is bounded.
When we build software, we absolutely prefer deterministic algorithms. This tends to match most users' expectations. All the other variations are second choices, used only when we’ve hit the limits of computability.
We can collect most of the data we need and derive extra stuff from it. Collecting data is crazy expensive, so we really don’t want to duplicate it. Usually, burning a little more CPU for derived data is a far cheaper alternative to keeping multiple copies. Not always, though.
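As a rough sketch of that trade-off, assuming a simple order record purely for illustration, the derived value is recomputed from the collected data on demand rather than stored as a second copy that could drift out of sync:

```python
from dataclasses import dataclass

@dataclass
class Order:
    line_item_prices: list[float]   # the collected data, stored once

    @property
    def total(self) -> float:
        # Derived data: a little CPU each time, but no duplicate copy to keep in sync.
        return sum(self.line_item_prices)

order = Order(line_item_prices=[20.00, 5.00, 3.50])
print(order.total)  # 28.5
```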
We also usually need to store a lot of historical data. Any sort of manual interference with the collected data, and any kind of operational change, produces a delta. We want these deltas, usually going back years if not decades. They allow us to diagnose problems correctly, and the problems are usually with people or disorganization.
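A minimal sketch of keeping those deltas, with a hypothetical append-only log and made-up field names: every change is recorded with who made it, when, and why, so the history is there when a problem needs diagnosing.

```python
from datetime import datetime, timezone

audit_log = []  # append-only history of changes

def record_change(record_id, field, old_value, new_value, who, reason):
    # Each delta is appended, never overwritten, so the full history survives.
    audit_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "record": record_id,
        "field": field,
        "old": old_value,
        "new": new_value,
        "who": who,
        "reason": reason,
    })

record_change("sensor-42", "reading", 17.2, 17.8, "ops-team", "manual recalibration")
print(audit_log[-1])
```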
The toughest part of software is context. The inputs into any code are never really local; they come from the operational and usage sides and get filtered through a lot of layers. This is the largest and most complex source of bugs for most software. The code itself may match expectations, but the inputs throw it off.
In general, the computations are simple when broken down, but they can be stacked up to extraordinary complexity. Understanding what’s needed for the outputs strengthens the ability to deliver them. Blindly fiddling with bits tends to produce a lot of bugs. Blindly making assumptions about the behaviour of integrations produces a lot of nasty bugs as well.
Since the computations exist to match user expectations, consistency is vital. Creative alternatives are usually perceived as glitches. It is better to just do what is expected than it is to go somewhere eclectic.
That may seem to make programming itself rather boring, but it is not. It's the output that should be boring; how the computation gets there does not need to be, and imagination and creativity are necessary to get past the obvious brute force. The quality of software comes from editing and refining the code, not just typing it in really fast. As we iterate, we kick up the quality. It always starts quite low.
Gluing code together is not the same skill as writing it. You are not building computations; you are just connecting them to each other, and they were designed and built by other people. Putting code into a framework and connecting all of the wires is also a form of gluing.
The best computations are the ones that do the absolute minimum amount of work. Obviously, redoing the same work over and over again is a waste of effort. Also, sometimes, by utilizing less obvious information, you can bypass the long way of doing the work and get to the results a lot faster.
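A small sketch of avoiding that redundant work, using Python's built-in memoization so each sub-result is computed only once:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    # Without the cache this redoes the same sub-computations exponentially often;
    # with it, each value is computed exactly once and then reused.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(80))  # answers instantly instead of taking effectively forever
```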