The Programmer's Paradox: Expression

Thursday, July 18, 2024

Expression

Often I use the term expressibility to mean the breadth of all possible permutations within some usage of a formal system. So state machines and programming languages have different expressibility. The limits of what you can do with them are different.

But there is another way to look at it.

You decide you want to build a solution. It fits a set of problems. It has an inherent complexity. Programmers visualize that in different ways.

When you go to code it for the computer, depending on the language, it may be more or less natural. That is, if you are going to code some complex mathematical equations, then a mathematics-oriented language like APL would be easier. In nearly the same way we express the math itself, we can write the code.

Although it is equivalent, if you express that same code with any imperative language, you will have to go through a lot more gyrations and transformations in order to fit those equations in the language. Some underlying libraries may help, but you still need to bend what you are thinking in order to fit it into the syntax.

Wherever and whenever we bend, there is a significantly increased likelihood of bugs. The bends tend to hide problems. You can’t just read it back and say “Yeah, that is actually what I was thinking.” The change of expression obscures that.
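As a rough sketch of what a bend looks like, take the textbook sample variance, the sum of squared deviations from the mean divided by n - 1. The formula and both function names are my own choice for illustration. The first version reads almost like the math. The second is algebraically the same, but it has been reshaped into running accumulators, and you can no longer read it back against the formula.

```python
def sample_variance(xs):
    """Sample variance written to mirror the formula directly.

    Assumes xs holds at least two numbers.
    """
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_bent(xs):
    """The same quantity, bent into a single accumulating loop.

    Uses the identity: sum((x - mean)^2) == sum(x^2) - (sum(x))^2 / n.
    Assumes xs holds at least two numbers.
    """
    total = 0.0
    total_sq = 0.0
    count = 0
    for x in xs:
        total += x
        total_sq += x * x
        count += 1
    return (total_sq - total * total / count) / (count - 1)


values = [2, 4, 4, 4, 5, 5, 7, 9]
direct = sample_variance(values)
bent = sample_variance_bent(values)
```

Both give the same answer for ordinary inputs, but only the first can be checked by looking at it. If someone mistyped a sign in the second one, it would take real effort to notice.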

A long time ago, for large, but routine systems, I remember saying that the code should nearly match the specifications. If the user wrote a paragraph explaining what the code should do, the code that does the work should reflect that almost perfectly.

The variables are the user terminology; the structure is as they described. If it were tight, we could show the author of the spec the finished code and they would be able to mostly understand it and then verify that it is what they want. There would be some syntactic noise, some intermediate values, and some error handling as well, but the user would likely be able to see through all of that and know that it was correct and would match the spec.
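Here is a minimal sketch of that idea, using a spec I made up for the purpose: "An order qualifies for free shipping if the customer is a member and the order total is at least 50, or if the order total is at least 100." Every name and number below is hypothetical. The point is only that the variables are the words from the paragraph and the structure follows its sentence.

```python
from dataclasses import dataclass

# Thresholds taken straight from the (invented) spec paragraph.
MEMBER_FREE_SHIPPING_MINIMUM = 50
GENERAL_FREE_SHIPPING_MINIMUM = 100


@dataclass
class Order:
    order_total: float
    customer_is_member: bool


def qualifies_for_free_shipping(order):
    """Mirror the spec sentence clause by clause."""
    if order.customer_is_member and order.order_total >= MEMBER_FREE_SHIPPING_MINIMUM:
        return True
    return order.order_total >= GENERAL_FREE_SHIPPING_MINIMUM


member_order = Order(order_total=60, customer_is_member=True)
guest_order = Order(order_total=60, customer_is_member=False)
```

Someone who wrote that spec could probably read the function, skip over the syntax, and tell you whether it matches what they asked for.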

That idea works well for specific complex calculations if they are tightly encapsulated in functions, but obviously, systems need a lot of other operational stuff around them to work. Still, the closer you get to that utopia, the more likely that visual inspections will bear fruit.

That doesn’t just affect quality but also enhances debugging and discussions. If someone has a question about how the system is working and you can answer that in a few seconds, it really helps.

Going the other way, we can roughly talk about how far the code drifts from the problem.

The programming language could be awkward and noisy. Expressing some complicated mathematics in assembler, for instance, would make it way harder to verify. All of the drift would have to be explained in comments, or almost no one could ever understand it.

Some languages require a lot of boilerplate, frameworks, and syntactic sugar. The expression there can bear little resemblance to the original problem.

Abstraction is another cause of drift. The code may solve a much more general problem, and then need some specific configurations to scope it down to the actual problem at hand. So the expression is split into two parts.

The big value of abstraction is reuse. Code it once, get it working, and reuse it again for dozens of similar problems. It is a huge time saver, but the expression is a little more complex.
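A small sketch of that split, with names and numbers invented for the example: one general lookup that knows nothing about the problem, plus configuration tables that scope it down. To understand what the shipping fee actually is, you now have to read two places instead of one, but the same lookup gets reused for a second problem for free.

```python
def lookup_tier(value, tiers):
    """General part: return the result of the highest tier that value reaches.

    Assumes tiers is a list of (threshold, result) pairs sorted by
    ascending threshold. Returns None if value is below every threshold.
    """
    result = None
    for threshold, tier_result in tiers:
        if value >= threshold:
            result = tier_result
    return result


# Specific part: hypothetical configuration that scopes the general code down.
SHIPPING_FEE_TIERS = [(0, 9.99), (50, 4.99), (100, 0.0)]
DISCOUNT_RATE_TIERS = [(0, 0.0), (10, 0.05), (100, 0.15)]


def shipping_fee(order_total):
    return lookup_tier(order_total, SHIPPING_FEE_TIERS)


def discount_rate(quantity):
    return lookup_tier(quantity, DISCOUNT_RATE_TIERS)
```

Neither `lookup_tier` nor the tables alone say "orders of 100 or more ship free". That statement only exists when you put the two halves together, which is the extra cost in expression.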

Expression in this sense isn’t that different from writing. You can say something in plain simple terms or you can hide your message in doublespeak. You might still be communicating the same things, but just making the listener's job a whole lot more difficult.

In the midst of a critical production bug, it really sucks if the expression of the code is disconnected from the behavior. At the extreme, it is spaghetti code. The twists and turns feel nearly random. Oddly, the worse the code expression, the more likely that there will be critical production bugs.

Good code doesn’t run into this issue very often; bad code hits it all of the time. Battle-tested abstract code is good unless there is a problem, but those problems are also rare. If you are fixing legacy code, most of what you will encounter will be bad code. The good or well-tested stuff is mostly invisible.
