Wednesday, September 10, 2025

Manifestations

The only two things in a computer are code and data.

Code is a list of instructions for a computer to follow. Data is a symbolic encoding, stored as bits, that represents something else.

In the simplest of terms, code is a manifestation of what a programmer knew when they wrote it. It’s a slight over-simplification, but not too far off.

More precisely, some code and some configuration data come directly from a programmer’s understanding.

There could be generated code as well. But in an oddball sense, the code that generated that code was the manifestation, so it is still there.

Any data in the system that has not been ‘collected’ is configuration data. It was understood and placed there by someone.

These days, most code comes from underlying dependencies. Libraries, frameworks, other systems, and products. Interactions with these are glued into the code. The glue code is the author’s understanding, and the dependency code is the understanding of all of the other authors who worked on it.

Wherever and however we boil it down, it comes down to something that some person understood at some point. Code does not spontaneously generate. At least not yet.

The organization and quality of the code come directly from its author. If they are disorganized, the code is disorganized. If they are confused, the code is confused. If they were rushed, the code is weak. The code is what they understand and are able to assemble as instructions for the computer to follow.

Computers are essentially deterministic machines, but the output of code is not guaranteed to be deterministic. There are plenty of direct and indirect ways of injecting non-determinism into code. Determinism is a highly valuable property; you really want it in code, where possible, because it is the anchor property for nearly all users' expectations. If the author does not understand how to keep the code deterministic, it will not be, because it is far too easy to make mistakes.
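As a rough illustration of how non-determinism sneaks in, here is a minimal Python sketch. The function names and the scenario of picking names from a list are hypothetical, invented only for this example. The first version draws from the unseeded global random generator, so its answer can change from run to run. The second version removes the hidden inputs by putting the names into a fixed order and taking an explicit seed, so the same inputs always give the same answer.

```python
import random


def pick_winners_nondeterministic(names, k):
    # Hypothetical example: draws from the global, unseeded generator,
    # so repeated runs of the program can return different winners.
    return random.sample(list(names), k)


def pick_winners_deterministic(names, k, seed):
    # Normalize the input first: removing duplicates and sorting means the
    # order in which the caller supplied the names cannot affect the result.
    ordered = sorted(set(names))
    # A private generator with an explicit seed: same seed, same choices.
    rng = random.Random(seed)
    return rng.sample(ordered, k)


if __name__ == "__main__":
    names = ["dana", "alex", "cory", "blake"]
    first = pick_winners_deterministic(names, 2, seed=7)
    second = pick_winners_deterministic(list(reversed(names)), 2, seed=7)
    print(first == second)  # the two calls agree despite the different input order
```

The point is not the seed itself but that every source of variation, here the input order and the random state, has been made explicit. Clocks, thread scheduling, and unordered collections are other common sources that need the same kind of attention.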

The fact that code is so closely tied to the understanding of its authors has a lot of ramifications. The most obvious is that if you do not know something, you cannot write code to accomplish it. You can’t because you do not know what that code should be.

You can use code from someone else who knows, but if there are gaps in their knowledge or it doesn’t quite apply to your situation, you cannot really fix it. You don’t know how to fix it. You can patch over the bad circumstances that you’ve found, but if they are just a few drops from a very large bucket, more will keep flowing.

As a consequence, the combined output from a large group of novice programmers will not exceed their individual abilities. It doesn’t matter how many participate; it is capped by understanding. They might be able to glue a bunch of stuff together, as learning how to glue things is a lesser skill than coding them, but all of the risks associated with those dependencies are still there and magnified by the lack of knowledge.

As mentioned earlier, a code generator is just a second level of indirection for the coding issues. It still traces back to people. Any code constructed by any automated process has the same problem, even if that process is sophisticated. Training an LLM to be a dynamic, but still automated, process does not escape this limitation. The knowledge that flowed into the code just comes from more sources, is highly non-deterministic, and rather obviously has even more risk. It’s the same as adding more novice programmers into the mix; it just amplifies the problems. We are told that enough monkeys typing randomly on typewriters could eventually produce Shakespeare, but that says nothing about the billions of monkeys you’ll need to do it, nor the effort to find that elusive needle in a rather massive haystack. It’s a tree falling in a forest with no one around.

For decades, there have been endless silver bullets launched in an attempt to separate code and configuration data from the people who need to understand it. As Frederick P. Brooks pointed out in his 1986 essay “No Silver Bullet”, it is not possible. Someone has to issue the instructions, and they cannot do that if they don’t understand them. The work in building software is acquiring that understanding; the code is just the manifestation of that effort. If you don’t do the work, you will not get the software. If you get rid of the people who did the work, you will not be able to continue the work.

Friday, September 5, 2025

Sophistication

Software can be very addictive when you need to use it.

No doubt there are other ways to deal with your problems, but the software just clicks so nicely that you can’t really find any initiative to change.

What makes software addictive is sophistication.

It’s not just some clump of dumb, awkward features. The whole is worth far more than the sum of its parts because it all comes together at a higher level, somehow.

Usually, it stems from some overlying form of abstraction. A guiding principle permeates all aspects of the work.

There is a simple interface on top that stretches far down into the depths. So, when you use it for a task, it makes it simple to get the work done, but it does the task so fully and completely, in a way that you can actually trust, that it could not have been done any better. You are not left with any lingering doubts or annoying side effects. You needed to do the task; the task is now done. Forever.

Crude software, on the other hand, gets you close, but you are left unsatisfied. It could have been done better; there are plenty of other little things that you have to do now to clean up. It’s not quite over. It’s never really over.

Sophistication is immensely hard to wire into software. It takes a great deal of empathy for the users and the ability to envision their whole path, from before the software gets involved to long afterward. It’s the notion that the features are only a small part of a larger whole, so they have to be carefully tailored for a tight fit.

It requires that you step away from the code, away from the technology, and put yourself directly into the user’s shoes. It is only from that perspective that you can see what ‘full’ and ‘complete’ actually mean.

It is incredibly hard to write sophisticated code. It isn’t just a bunch of algorithms, data structures, and configuration. Each and every tiny part of the system adds to or subtracts from the overall value. So the code is deep and complex and often pushes right up against the boundaries of what is really possible with software. It isn’t over-engineering, but it sure ain’t simple either. The code goes straight into the full complexity and depth of the problem. Underneath, it isn’t crude, and it isn’t bloated. It’s a beautiful balance point, exactly where the user needs it to be.

Most people can’t pull sophistication out of thin air. It’s very hard to imagine it until you’ve seen it. It’s both fiddly and nitpicky, but also abstract and general. It sits there right in the middle with deep connections into both sides. That’s why it is so rare. It is the product of a grandmaster, not just someone dabbling in coding.

Once sophisticated code gets created, because it is so addictive, it has a very, very long lifetime. It outlasts its competitors and usually generations of hollow rewrites. Lots of people throw crude stuff up against it, but it survives.

Sophistication is not something you add quickly. Just the understanding of what it truly means is a long, slow, painful journey. You do not rush it; that only results in crude outcomes. It is a wonderful thing that is unfortunately not appreciated enough anymore.