The Programmer's Paradox
Software is a static list of instructions, which we are constantly changing.
Wednesday, September 10, 2025
The only two things in a computer are code and data.
Code is a list of instructions for a computer to follow. Data is a symbolic encoding: bits arranged to represent something else.
In the simplest of terms, code is a manifestation of what a programmer knew when they wrote it. It’s a slight over-simplification, but not too far off.
More precisely, some code and some configuration data come directly from a programmer’s understanding.
There could be generated code as well. But in an oddball sense, the code that generated that code is the manifestation, so the understanding is still there.
Any data in the system that has not been ‘collected’ is configuration data. It was understood and placed there by someone.
These days, most code comes from underlying dependencies. Libraries, frameworks, other systems, and products. Interactions with these are glued into the code. The glue code is the author’s understanding, and the dependency code is the understanding of all of the other authors who worked on it.
Wherever and however we boil it down, it comes down to something that some person understood at some point. Code does not spontaneously generate. At least not yet.
The organization and quality of the code come directly from its author. If they are disorganized, the code is disorganized. If they are confused, the code is confused. If they were rushed, the code is weak. The code is what they understand and are able to assemble as instructions for the computer to follow.
Computers are essentially deterministic machines, but the output of code is not guaranteed to be deterministic. There are plenty of direct and indirect ways of injecting non-determinism into code. Determinism is a highly valuable property; you really want it in code, where possible, because it is the anchor property for nearly all users' expectations. If the author does not understand how to do this, the code will not be deterministic, and it is far too easy to make mistakes.
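A minimal sketch of the difference, in Python; the function names and record fields here are hypothetical, just to illustrate where non-determinism sneaks in and how to push it back out to the caller:

    import random
    import time

    # Non-deterministic: the output depends on the wall clock and an
    # unseeded random number generator, so it changes on every run.
    def tag_record(record):
        record["id"] = random.randrange(1 << 32)
        record["stamp"] = time.time()
        return record

    # Deterministic: the caller supplies the seed and the clock, so the
    # same inputs always produce exactly the same output.
    def tag_record_deterministic(record, seed, now):
        rng = random.Random(seed)
        record["id"] = rng.randrange(1 << 32)
        record["stamp"] = now
        return record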
Code being so closely tied to the understanding of its authors has a lot of ramifications. The most obvious is that if you do not know something, you cannot write code to accomplish it. You can’t because you do not know what that code should be.
You can use code from someone else who knows, but if there are gaps in their knowledge or it doesn’t quite apply to your situation, you cannot really fix it. You don’t know how to fix it. You can patch over the bad circumstances that you’ve found, but if they are just a drop in a very large bucket, they will keep flowing.
As a consequence, the combined output from a large group of novice programmers will not exceed their individual abilities. It doesn’t matter how many participate; it is capped by understanding. They might be able to glue a bunch of stuff together, as learning how to glue things is a lesser skill than coding them, but all of the risks associated with those dependencies are still there and magnified by the lack of knowledge.
As mentioned earlier, a code generator is just a second level of indirection for the coding issues. It still traces back to people. Any code constructed by any automated process has the same problem, even if that process is sophisticated. Training an LLM to be a dynamic, but still automated, process does not escape this limitation. The knowledge that flowed into the code just comes from more sources, is highly non-deterministic, and rather obviously carries even more risk. It’s the same as adding more novice programmers into the mix; it just amplifies the problems. We are told that enough monkeys randomly typing on typewriters could eventually generate Shakespeare, but that says nothing about the billions of monkeys you would need, nor the effort to find that elusive needle in a rather massive haystack. It’s a tree falling in a forest with no one around.
For decades, there have been endless silver bullets launched in an attempt to separate code and configuration data from the people who need to understand them. As Frederick P. Brooks pointed out in the 1980s, it is not possible. Someone has to issue the instructions, and they cannot do that if they don’t understand them. The work in building software is acquiring that understanding; the code is just the manifestation of that effort. If you don’t do the work, you will not get the software. If you get rid of the people who did the work, you will not be able to continue the work.
Friday, September 5, 2025
Sophistication
Software can be very addictive when you need to use it.
No doubt there are other ways to deal with your problems, but the software just clicks so nicely that you can’t really find any initiative to change.
What makes software addictive is sophistication.
It’s not just some clump of dumb, awkward features. The value is far more than the sum of the parts because it all comes together at a higher level, somehow.
Usually, it stems from some overarching form of abstraction: a guiding principle that permeates all aspects of the work.
There is a simple interface on top that stretches far down into the depths. So, when you use it for a task, it makes it simple to get the work done, but it does the task so fully and completely, in a way that you can actually trust, that it could not have been done any better. You are not left with any lingering doubts or annoying side effects. You needed to do the task; the task is now done. Forever.
Crude software, on the other hand, gets you close, but you are left unsatisfied. It could have been done better; there are plenty of other little things that you have to do now to clean up. It’s not quite over. It’s never really over.
Sophistication is immensely hard to wire into software. It takes a great deal of empathy for the users and the ability to envision their whole path, from before the software gets involved to long afterward. It’s the notion that the features are only a small part of a larger whole, so they have to be carefully tailored for a tight fit.
It requires that you step away from the code, away from the technology, and put yourself directly into the user’s shoes. It is only from that perspective that you can see what ‘full’ and ‘complete’ actually mean.
It is incredibly hard to write sophisticated code. It isn’t just a bunch of algorithms, data structures, and configuration. Each and every tiny part of the system adds or subtracts value from the whole. So the code is deep and complex and often pushes right up against the boundaries of what is really possible with software. It isn’t over-engineering, but it sure ain’t simple either. The code goes straight into the full complexity and depth of the problem. Underneath, it isn’t crude, and it isn’t bloated. It’s a beautiful balance point, exactly where the user needs it to be.
Most people can’t pull sophistication out of thin air. It’s very hard to imagine it until you’ve seen it. It’s both fiddly and nitpicky, but also abstract and general. It sits there right in the middle with deep connections into both sides. That’s why it is so rare. The product of a grand master, not just someone dabbling in coding.
Once sophisticated code gets created, because it is so addictive, it has a very, very long lifetime. It outlasts its competitors and usually generations of hollow rewrites. Lots of people throw crude stuff up against it, but it survives.
Sophistication is not something you add quickly. Just the understanding of what it truly means is a long, slow, painful journey. You do not rush it; that only results in crude outcomes. It is a wonderful thing that is unfortunately not appreciated enough anymore.
Friday, August 22, 2025
Ordering
For any given amount of work, there is at least one order of doing that work which is the most efficient.
If there are dependencies between the sub-tasks, then there is also at least one order of doing the work that is the least efficient.
For any two tasks, there may be obvious dependencies, but there may be non-obvious secondary ones as well. If one task requires speed and the other requires strength, they may seem unrelated, but if the same person does both, issues like muscle fatigue or tiredness connect them.
With most tasks, most of the time, you should assume there are both known and unknown dependencies, which means there is almost always a most-efficient set of orderings. It is rare for this not to be the case.
For any given dependency, its effect is to reweight the effort needed for both tasks. Doing the tasks in the wrong order means the second one will now take longer than it otherwise would have. We refer to that extra effort as friction on the second task. If a task would normally take four days, but starting it before its prerequisite is finished stretches it to six, those two extra days are the friction.
Like dependencies, there is obvious friction and then non-obvious friction. If you do some task and many of the lower sub-tasks take a little longer, but you don’t know why, there is some non-obvious friction happening, which indicates that there are some non-obvious dependencies involved.
All this applies heavily to software development. When building a big system, there are ways through all of the tasks that are more efficient than others. From experience, the difference is a multiplier. You could spend 3x more effort building it one way than some other way, for example. In practice, I have seen much higher multipliers, like 10x or even crazy ones like 100x.
It’s sometimes not obvious, as large projects span long periods and have many releases in between. You’d have to step back and observe the whole lifecycle of any given project to get a real sense of the damage that some more subtle types of friction have caused.
But the roots of the friction are often the same: someone is trying to do a task that depends on another one before that foundational task is completed, which means changes to the lower task as it proceeds cause extra work for the higher one.
We can skip over architectural discussions about height and simply assess whether one piece of code or data depends on another piece of code or data. That is a dependency which, when handled out of order, creates friction.
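A minimal sketch of ordering by dependencies, assuming Python 3.9+; the task names and the graph itself are hypothetical:

    from graphlib import TopologicalSorter

    # Hypothetical task graph: each task maps to the tasks it depends on.
    deps = {
        "ui":          {"service"},
        "service":     {"model", "persistence"},
        "persistence": {"model"},
        "model":       set(),
    }

    # A topological order finishes every prerequisite before anything
    # built on top of it, which is exactly the low-friction ordering.
    print(list(TopologicalSorter(deps).static_order()))
    # e.g. ['model', 'persistence', 'service', 'ui']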
Overall, it always means that you should build things from the bottom up. That would always be the most efficient way of getting through the tasks. Practically, that is not always possible for the system as a whole, but it is often possible within thin verticals in the system. If you add a new feature, it would be most efficient to address the modelling and persistence of the data first, then gradually wire it in from there until you get to the user interface. What might have driven the need for such a feature was the user experience or their domain problems, and that analysis is needed before the coding starts, which is top-down. But after that, you flip the order for the implementation to bottom-up, and that is the fastest way to make it happen.
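A minimal sketch of that bottom-up order within one thin vertical, in Python; all of the names here are hypothetical:

    from dataclasses import dataclass

    # 1. Model first: the data the feature is actually about.
    @dataclass
    class Invoice:
        number: str
        total: float

    # 2. Persistence next: built against the model, nothing above it yet.
    class InvoiceStore:
        def __init__(self):
            self._rows = {}

        def save(self, invoice):
            self._rows[invoice.number] = invoice

        def load(self, number):
            return self._rows[number]

    # 3. Only then the layer the user sees, wired onto a stable foundation.
    def show_invoice(store, number):
        invoice = store.load(number)
        return f"Invoice {invoice.number}: {invoice.total:.2f}"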
That the order is flipped depending on the stage is counterintuitive, which is why it is so controversial. But if you work it back from the first principles above, you can see why this happens.
In development, order is important. If you build too slowly, that spins off politics, which often further degrades the ordering, so getting control of this is vital to getting the work out as efficiently and smoothly as possible. Most people will suggest inefficient orders based on their own understanding, so it is better not to leave the ordering in the hands of most people.
Friday, August 15, 2025
Bad Engineering
Lots of people believe that if you just decompose a big problem into enough little ones, it is solved.
A lot of the time, though, the decomposition isn’t into complete sub-problems but just partial ones. The person identifies a sub-problem, they bite off a little part of it, and then push the rest back out.
A good example: some data needs processing, so someone builds a middleware solution to help, but it is configurable so that the actual processing can be injected. It effectively wraps the processing, but doesn’t include any actual processing, or only minimal versions of it.
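A minimal sketch of the shape of such a thing, in Python; the class and its behaviour are made up for illustration:

    # A hypothetical 'processing framework' that wraps everything
    # except the work itself.
    class Pipeline:
        def __init__(self, processor):
            # The hard part, the actual processing, is injected by the caller.
            self.processor = processor

        def run(self, records):
            # The framework loops and collects; every real decision about
            # parsing, validation, and transformation is still your problem.
            return [self.processor(record) for record in records]

    # The 'solution' ships with no solution in it.
    pipeline = Pipeline(processor=lambda record: record)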
Then someone comes along and needs that processing. They learn this new tech, but then later realize that it doesn’t really solve their problem, and now they have a lot more fragments that still need to be solved.
Really, it just splintered into sub-problems; it didn’t solve anything. It’s pure fragmentation, not encapsulation. It’s not really a black box if it’s just a shell to hold the actual boxes …
If you do this a lot as the basis for a system, the sheer number of moving parts will make the system extraordinarily fragile. One tiny, unfortunate change in any of the fragments and it all goes horribly wrong. Worse, that bad change is brutal to find, as it could be anywhere in any fragment. If it wasn’t well organized and repo’d, then it is a needle in a haystack.
Worse, each fragment's injection is different from all of the other fragments’ injections. There is a lot of personality in each component configuration. So instead of having to understand the problem, you now have to understand all of these fragmented variations and how they should all come together, which is often far more complex than the original problem. So you think you've solved it, but instead you just made it worse.
If you look at many popular tech stacks, you see a huge amount of splinter tech dumped there.
They become popular because people think they are a shortcut to not having to understand the problems, and only realize too late that it is the long, treacherous road instead.
Companies like to build splinter tech because it is fast and relatively easy to get to market. You can make great marketing claims, and by the time the grunts figure it out, it is too late to toss, so it is sticky.
Splinter tech is bad engineering. It is both bloat and obfuscation. Fragments are a big complexity multiplier. A little of it might be necessary, but it stacks up quickly. Once it is out of control, there is no easy way back.
It hurts programmers because they end up learning all these ‘component du jour’ oddities, then the industry moves on, and that knowledge is useless. Some other group of splinter tech hackers will find a completely different and weird way of doing similar things later. So it's temporary knowledge with little intrinsic value. Most of this tech has a ten-year or less lifespan. Here today, gone tomorrow. Eventually, people wake up and realize they were duped.
If you build on tech with a short lifespan, it will mostly cripple your work’s lifespan too. The idea is not to grind out code, but to solve problems in ways that stay solved. If it decays rapidly, it is a demo, not a system. There is a huge difference between those two.
If you build on top of bad engineering, then that will define your work. It is bad by construction. You cannot usually un-bad it if you’re just a layer of light work or glue on top. Its badness percolates upwards. Your stuff only works as well as the components it was built on.
Friday, August 8, 2025
Static vs Dynamic
I like the expression ‘the rubber meets the road’.
I guess it is an expression about driving, where the rubber is the tires, but it also applies in a rather interesting way to software.
When a software program runs, it issues millions, if not billions, of very, very specific instructions for the computer to follow.
When we code, we can add variability to that, so we can make one parameter an integer, and we can issue the exact same instructions but with different values. We issue them for value 20, then we issue them again for 202, for example.
That, relative to the above expression, is the rubber meeting the road twice, once for each value.
Pull back a little from that, and what we have is a ‘context’ of variability that we actuate to get the instructions with a rather specific value for each variable.
In programming, if we just hardcode a value into place, it is not a variable. We tend to call this ‘static’, being that it doesn’t change. When the rubber hits the road, it was already hardcoded.
If we allow it to vary, then the code is at least ‘dynamic’ on that variable. We pick from a list of possible options, then shove it in, and execute the whole thing.
We can pick directly from a list of possible values, or we can have ‘levels of indirection’. We could have a ‘pointer’ in the list that we use to go somewhere else and get the value, which is one level of indirection. Or we could stack the indirections so that we have to visit a whole bunch of different places before the rubber finally meets the road.
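A minimal sketch of the three cases in Python; the tax-rate example and its values are made up for illustration:

    # Static: the value is hardcoded; it was decided before the program ran.
    def price_static(amount):
        return amount * 1.13          # tax rate baked in

    # Dynamic: the rate is a variable, picked when the code runs.
    def price_dynamic(amount, rate):
        return amount * rate

    # One level of indirection: a key points somewhere else for the value.
    TAX_RATES = {"ontario": 1.13, "alberta": 1.05}

    def price_indirect(amount, region):
        return amount * TAX_RATES[region]   # region -> rate -> instructions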
With the instructions, we can pretty much make any of the data they need variable. But we can also make the instructions variable, and oddly, the number of instructions can vary too. So, we have degrees of dynamic behaviour, and on top, we can throw in all sorts of levels of indirection.
From a complexity perspective, for each and every thing we make dynamic and for each and every level of indirection, we have kicked up the complexity. Static is the simplest we can do, as we need that instruction to exist and do its thing. Everything else is more complex on top.
From an expressibility and redundancy perspective, making a lot of stuff dynamic is better. You don’t have to have similar instructions over and over again, and you can use them for a much wider range of problems.
If you were to make a specific program fully dynamic, you would actually just end up with a domain-specific programming language. That is, taken too far, since the rubber has to meet the road at some point at runtime, the code itself would end up being refactored into a full language. We see this happen quite often, where so many features get piled on, and then someone points out that it has become Turing complete. You’ve gone a little too far at that point, unless the point was to build a DSL. So, for instance, SQL being Turing complete is actually fine; full persistence solutions are DSLs almost by definition. Newer implementations of regular expressions being Turing complete, however, is a huge mistake, since that corrodes the predictable behaviour guarantees that make them so useful.
All of this gets us back to the fundamental tradeoff between static and dynamic. Crafting similar things over and over again is massively time-consuming. Doing it once, but making some parts variable is far better. But making everything dynamic goes too far, and the rubber still needs to meet the road. Making just enough dynamic that you can reuse it everywhere is the goal, but throwing in too many levels of indirection is essentially just fragmenting it all into a nightmare.
There is no one-size-fits-all approach that always works, but for any given project, there is some degree of dynamic code that is the most efficient over the longer term. So if you know that you’ll use the same big lump of code 7 times in the solution, then adding enough variability to cover all 7 cases with the same piece of code is best, and keeping all 7 static configs for it in the same place is perfect. That minimizes everything, which is the best you can do.
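A minimal sketch of that shape, assuming a hypothetical reporting job; the names, fields, and the fetch/render hooks are all made up:

    # All seven static configurations live together in one table.
    EXPORT_CONFIGS = {
        "daily_sales":   {"source": "sales", "days": 1,  "fmt": "csv"},
        "weekly_sales":  {"source": "sales", "days": 7,  "fmt": "csv"},
        "monthly_sales": {"source": "sales", "days": 30, "fmt": "pdf"},
        # ... the other four variants, same shape ...
    }

    # One parameterized routine instead of seven near-copies; the caller
    # supplies the fetch and render functions in this sketch.
    def export(name, fetch, render):
        cfg = EXPORT_CONFIGS[name]
        rows = fetch(cfg["source"], cfg["days"])
        return render(rows, cfg["fmt"])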