All “things” are distinguished by whether they can be reasonably decomposed into smaller things. The lines of decomposition are the similarities or differences. Not only do we need to break things down into their smallest parts, but we also need to understand all of the effects between the parts.
Some ‘thing’ is simple if it defies decomposition. It is as broken down as it can be. It gets more complex as the decompositions are more abundant. It is multidimensional relative to any chosen categorizations, so it isn’t easy to compare relative complexity. But it is easy to compare it back to something on the same set of axes that is simpler. One hundred things is marginally more complex than five things.
This is also true of “events”. A single specific event in and of itself is simple, particularly if it does not involve any ‘things’. Events get more complex as they are related to each other, maybe just sequentially or possibly as cause and effect. A bunch of related events is more complex than any single one. Again, it is multidimensional based on categorizations, and since events can involve things, that makes it even more multidimensional.
For software, some type of problem exists in reality and we have decided that we can use software to form part of a solution to it. Most often, software is only ever a part of the solution; since the problem itself is always anchored in reality, there have to be physical bindings.
Users, for instance, are just digital collections of information that act as a proxy for one or more people who utilize the software. Without people interacting with the software, or being tracked by it, the concept of users is meaningless.
Since we have a loose relative measure for complexity, we can get some sense of the difference between any two software solutions. We can see that the decomposition of one may be considerably simpler than another, for example, although it gets murky when it crosses over trade-offs. Two nearly identical pieces of software may only really differ by some explicit trade-off, but as those trade-offs are anchored in reality, they would not share the same complexity. It would be relative to their operational usage, a larger context.
But if we have a given problem and we propose a solution that is vastly simpler than what the problem needs, we can see that the solution is “oversimplified”. We know this because the full width of the solution does not fit the full width of the problem, so parts of the problem are left exposed. They may require other software tools, or possibly physical tools, in order to get the problem solved.
So, in that sense, the fit of any software is defined by its leftover unsolved aspects. These of course have their own complexity. If we sum up the complexity of the solution with that of these gaps, and of any overlaps, that gives us the total complexity. In that case, we find that an oversimplified solution has a higher overall complexity than a properly fitting one. Not individually, but overall.
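As a rough sketch of that accounting (the numbers and units here are invented purely for illustration), the idea is just that the total is whatever the solution covers plus everything left over around it:

# Toy accounting of total complexity, in arbitrary units.
# An oversimplified solution covers less of the problem directly, so the
# leftover gaps, and any overlapping outside tools, add their own cost back in.

def total_complexity(solution, gaps, overlaps):
    return solution + sum(gaps) + sum(overlaps)

well_fitting = total_complexity(solution=100, gaps=[], overlaps=[])
oversimplified = total_complexity(solution=60, gaps=[30, 25], overlaps=[10])

assert oversimplified > well_fitting  # 125 > 100: the simpler piece costs more overall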
We can see that the problem has an ‘intrinsic’ complexity. The partial fit of any solution must be at least as complex as the parts it covers. All fragments and redundancies have their own complexity, but we can place their existence in the category of ‘artificial’ complexity relative to any better-fitting solution.
We might see that in terms of a GUI that helps to keep track of nicely formatted text. If the ability to create, edit, and delete the text is part of the solution, but it lacks the ability to create, edit, and delete all of the different types of required formatting, then it would force the users to go to some outside tool to do that work. So, it’s ill-fitting. A second piece of software is required to work with it. The outside tools themselves have an intrinsic complexity of their own, but relative to the problem we are looking at, having to learn and use them is just artificial complexity. Combined, that is significantly more complexity than if the embedded editing widget in the main software just allowed the user to properly manage the formatting.
Keep in mind that this would be different than the Unix philosophy of scripting, as in that case, there are lots of little pieces of software, but they all exist as intrinsic complexity ‘within’ the scripting environment. They are essentially inside of the solution space, not outside.
We can’t necessarily linearize complexity and make it explicitly comparable, but we can understand that one instance of complexity is more complex than another. We might have to back up the context a little to distinguish that, but it always exists. We can also relate complexity back to work. It would be a specific amount of work to build a solution to solve some given problem, but if we meander while constructing it, obviously that would be a lot more work. The shortest path, with the least amount of work, would be to build the full solution as effectively as possible so that it fully covers the problem. For software, that is usually a massive amount of work, so we tend to do it in parts, and gradually evolve into a more complete solution. If the evolution wobbles though, that is an effort that could have been avoided.
All of this gives us a sense that the construction of software as a solution is driven by the understanding and controlling of complexity. Projects are smooth if you understand the full complexities of the problem and find the best path forward to get them covered properly by a software solution as quickly as possible. If you ignore some of those complexities, inherent or artificial, they tend to generate more complexity. Eventually, if you ignore enough of them the project gets out of control and usually meets an undesirable fate. Building software is an exercise in controlling complexity increases. Every decision is about not letting it grow too quickly.
Software is a static list of instructions, which we are constantly changing.
Sunday, September 22, 2024
Editing Anxieties
An all too common mistake I’ve seen programmers make is becoming afraid of changing their code.
They type it in quickly. It’s a bit muddled and probably a little messy.
Bugs and changes get requested right away as it is not doing what people expect it to do.
The original author, and many of those who follow, seek to make the most minimal changes they can. They are dainty with the code. They only do the littlest things in the hope of improving it.
But the code is weak. It was not fully thought out; it was poorly implemented.
It would save a lot of time to make rather bigger changes. Bold ones. Not to rewrite it, but rather to take what is there as a rough approximation of what should have been there instead.
Break it up into lots of functions.
Rename the variables and the function name.
Correct any variable overloading. Throw out redundant or unused variables.
Shift around the structure to be consistent, moving lines of code up or down the call stack.
All of those are effectively “nondestructive” refactorings. They will not change what the code is doing, but they will make it easier to understand.
Nondestructive refactors are too often avoided by programmers, but they are an essential tool in fixing weak codebases.
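A small sketch of what such a pass looks like in practice (the function and variable names here are invented for illustration); the behavior is identical before and after, but the intent becomes visible:

# Before: muddled names, a magic number, and everything inlined.
def proc(d, f):
    r = []
    for x in d:
        if f and x.get("t") == "A":
            r.append(x["v"] * 1.13)
        elif not f:
            r.append(x["v"])
    return r

# After: the same behavior, just broken up, renamed, and made consistent.
TAX_RATE = 1.13  # the magic number gets a name

def is_taxable(item, apply_tax):
    return apply_tax and item.get("t") == "A"

def line_total(item, apply_tax):
    return item["v"] * TAX_RATE if is_taxable(item, apply_tax) else item["v"]

def invoice_totals(items, apply_tax):
    totals = []
    for item in items:
        if apply_tax and not is_taxable(item, apply_tax):
            continue  # preserve the original skip behavior, even if it looks suspicious
        totals.append(line_total(item, apply_tax))
    return totals

Nothing about the output changes, which is exactly the point; whatever is wrong with the logic is now far easier to see and argue about.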
Once you’ve cleaned up the mess, and it is obvious what the code is doing, then you can decide how to make it do what it was supposed to do in the first place. But you need to know what is there first, in order to correctly change it.
If you avoid fixing the syntax, naming, inconsistencies, etc., it will not save time, only delay your understanding of how to get the code to where it should be. A million little fixes will not necessarily converge on correct code. It can be endless.
Thursday, September 12, 2024
Ambiguities
What is unknowable is just not knowable. It is an ambiguity. It could be one answer or it could be any other. There is no way to decide. The information to do so simply does not exist. It isn’t open to discussion; it is a byproduct of our physical universe.
The languages we use to drive computers are often Turing complete. This means they are highly expressive and deterministic, yet buried beneath them there can also be ambiguities. Some are accidentally put there by programmers; some are just intrinsically embedded.
I saw a post about a bug caused by one of the newer configuration languages. It mistook the type of a piece of data and made a really bad assumption about it. That is a deliberate ambiguity.
We choose implicit typing far too often just to save a little expression, but then we overload it later with polymorphic types, and we sometimes don’t get the precedence right. To be usable the data must be typed, but because that is left to the software to decide, and the precedence is incorrect, it forms an ambiguity. The correct data type is unknown and cannot be inferred. Mostly it works, right up until it doesn’t. If the original programmers had forced explicit typing, or the precedence was tight, the issue would go away.
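The configuration language in that post isn’t named here, but the same class of problem shows up in, for example, YAML 1.1 as implemented by PyYAML, where unquoted scalars are implicitly typed by pattern matching rather than by intent:

import yaml  # PyYAML follows the YAML 1.1 implicit typing rules

print(yaml.safe_load("country: no"))     # {'country': False} -- the classic "Norway problem"
print(yaml.safe_load("version: 3.10"))   # {'version': 3.1}   -- the trailing zero is gone
print(yaml.safe_load("country: 'no'"))   # {'country': 'no'}  -- quoting forces the intended type

Quoting the value, or validating against a schema that fixes the types, removes the ambiguity entirely, which is the point about forcing typing above.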
My favorite ambiguity is the two generals’ problem. It sits deep in the heart of transactional integrity, plaguing distributed systems everywhere. It is simple: if one computational engine sends a message to another one and receives no response, you can never know if it was the message that disappeared or the response. The information needed to correctly choose just doesn’t exist.
If the message was an action, you can’t know right away if the action happened or not.
It is the cause of literally millions of bugs, some encountered so frequently that they are just papered over with manual interventions.
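A minimal sketch of why the two failure cases are indistinguishable (the lossy ‘network’ here is simulated; it is not any real protocol):

import random

class Timeout(Exception):
    pass

server_orders = []  # state on the "other side" of the wire

def send_over_lossy_network(order):
    if random.random() < 0.1:       # the request is lost: the action never happened
        raise Timeout()
    server_orders.append(order)     # the action happened...
    if random.random() < 0.1:       # ...but the response is lost on the way back
        raise Timeout()
    return "ACK"

def place_order(order):
    try:
        return send_over_lossy_network(order)
    except Timeout:
        # Both failure paths look exactly the same from here. Was the order
        # recorded or not? The information needed to decide does not exist
        # on this side of the wire.
        return "UNKNOWN"

In practice the ambiguity is managed rather than removed: retries paired with idempotent operations, or later reconciliation, make the bad case rare enough that it only surfaces as the occasional manual intervention.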
What makes it so fascinating is that although you can never reduce the ambiguity itself, you can still wire up the behavior to fail so infrequently that you might never encounter the problem in your lifetime. That is some pretty strange almost non-deterministic behavior.
Ambiguities are one of the great boundaries of software. Our code is limited. We can endeavor to craft perfect software, but we can never exceed the imperfections of reality. Software isn’t nearly as soft as we would like to believe; it is always tied to physical machines, which so often mess up the purity of their formal mechanics.
Sometimes, to build good software, you have to have both a strong understanding of what it should do and a strong understanding of what it cannot do. That balance is important.
Thursday, September 5, 2024
Value
We’ll start by defining ‘value’ as a product or service that somebody needs or wants. Their desire is strong enough that they are willing to pay something to get it.
There is a personal cost, in that money -- particularly when it comes from salary -- is an aspect of time. You have to spend time in your life to earn it. So, spending that money is also accepting that that time is gone too.
It is extraordinarily difficult to create value. Very few people can do it by themselves. There are not that many more people who can drive others to do it with them. Often it requires manifesting something complex, yet balanced, out of imagination, which is why it is such a rare ability.
Once value exists, lots of other people can build on top of it. They can enhance or expand it. They can focus on whatever parts of the production are time-consuming and minimize it. Value explodes when it is scaled; doing something for ten people is not nearly as rewarding as doing it for thousands.
The difference between the cost to produce and the willingness to pay is profit. There is always some risk, as you have to spend money before you will see if it pays off. For new ideas, the risk is very high. But as the value proves itself, those risks start to diminish. At some point, they are nearly zero; this is often referred to as ‘printing money’. You just need to keep on producing the value and collecting the profits.
At some point, the growth of the profitability of any value will be zero. The demand is saturated. If production at zero growth is running smoothly, the profits are effectively capped. There is nowhere to go except down.
The easiest thing you can do with value is degrade it. If you fall below the minimum for the product, people will gradually stop wanting to pay for it. If it’s been around for a while, habits will keep some purchases intact, but it will slowly trickle away. The profits will eventually disappear.
A common way to degrade value is to miss what is most valuable about it. People may be buying it, but their reasons for doing so are not necessarily obvious. The secondary concerns may be what is keeping them coming back. If you gut those for extra profit, you kill the primary value.
People incorrectly assume that all value has a life span. That a product or service lives or dies fairly quickly, and that turnover is normal. That is true for trendy industries, where the secondary aspects are that it is new and that it is popular, but there is an awful lot of base value that people need in their lives and do not want to be shifting around. Trustworthy constants are strong secondary attributes. The market may fluctuate slightly, but the value is as strong as ever. It is a stable business with no growth, but good long-term potential.
Exploiting value is the core of business. Businesses find it, get as much money out of it as possible, then move on to something else. When the goal is perpetual growth, which isn’t ever possible, the value turns over too quickly, leading to a lot of instability. That is a destabilizing force on any society.