There are two basic ways of writing software code: experimentation and visualization.
With experimentation, you add a bunch of lines of code, then run it to see if it worked. As it is rather unlikely to work the first time, you modify some of the code and rerun. You keep this up until a) you have all the code you need and b) it does what you expect it to do.
For visualization, you think about what the code needs to do first. Maybe you draw a few pictures, but really, the functionality of the code is in your head. You are “seeing” it in some way. Then, once you are sure that it is the code you need, you type it out line by line, staying as close as you can to the way you imagined it. After you’ve fixed typos and syntactic problems, the code should behave in the way you intended.
Experimentation is where everyone starts when they learn programming. You just have to keep trying things and changing them until the code behaves in the way you want it to.
What’s important, though, is that when the code does not work as expected, which is common, you dig a little to figure out why it failed. Learn from failure. Some people, though, will just keep making semi-random changes to the code, hoping to stumble onto a working version.
That isn’t so bad when there are only a small number of permutations; you end up visiting most of them. But for bigger functionality, there can be a massive number of permutations, and in some cases effectively an infinite number. If you are not learning from each failure, it could take an awfully long time before you stumble upon the right changes. By refusing to learn something from each failure, you cap your abilities at fairly small pieces of code.
Instead, the best approach is to hypothesize about what will happen each time before you run the code. When the behaviour differs from your hypothesis, and it mostly will, you use that difference as a reason to dig into what’s underneath. Little by little, you will build up a stronger understanding of what each line of code does, what the lines do in combination, and how you can better leverage them. Randomly changing things and ignoring the failures wastes a lot of time and skips the learning you need to do.
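As a small, purely illustrative sketch of that hypothesis-first loop (the median function here is hypothetical, just something to experiment against):

# Write the expectation down before running the code, so a surprising result
# forces an explanation instead of another semi-random edit.
def median(values):
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

# Hypothesis: for an even-length list, the median should be 2.5, the average
# of the two middle values.
print(median([1, 2, 3, 4]))  # Prints 3 -- that difference is the thing to dig into.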
Visualization comes later, once you’ve started to build up a strong internal model of what’s happening underneath. You don’t have to write code to see what happens; instead, you can decide what you want to happen and then just make the code do that. This opens the door not only to writing bigger things, but also to writing far more sophisticated things. A step closer to mastering coding.
Experimentation is still a necessity, though. Bad documentation, weak technologies, weird behaviours; modern software is a mess and getting a little worse each year. As long as we keep rushing through the construction, we’ll never get a strong, stable foundation. We’re so often building on quicksand these days.
Thursday, October 2, 2025
The Value of Thought
You can randomly issue millions of instructions to a computer.
It is possible that when they are executed, good things will happen, but the odds of that are infinitesimally small.
If you need a computer to do anything that is beyond trivial, then you will need a lot of carefully constructed instructions to make it succeed.
You could try to iterate your way into getting these instructions by experimentation, using trial and error. For all of the earlier iterations just before the final successful one, though, some amount of the included instructions will essentially be random, so as initially stated, the odds that you blunder into the right instructions are tiny.
Instead, even if you are doing some experimentation, you are doing that to build up an internal understanding of how the instructions relate back to the behaviors of the computer. You are building a mental model of how those instructions work.
To be good at programming, you end up having to be good at acquiring this knowledge and using it to quickly build up models. You have to think very carefully about what you are seeing, how it behaves, and what you’d prefer it to have done instead.
These thoughts allow you to build up an understanding that is then manifested as code, which are the instructions given to the computer.
Which is to say that ‘coding’ isn’t the effort, thinking is. Coding is the output from acquiring an understanding of the problem and a possible solution to it. The software is only as good as the thoughts put into it.
If you approach the work too shallowly, then the software will not fit all of the expected behaviours. If the problems to be solved are deep and complex, then the knowledge needed to craft a good solution will also be deep and complex.
We see and acknowledge the value of the existing code, essentially as a form of intellectual property, but we are not properly valuing the knowledge, skills, time, and deep thinking that are necessary to have created such code. Software is only as good as the understanding of the programmers who created it. If they are clueless, the software is close to random. If they only understand a little part of what they are doing, the missing knowledge is getting randomized.
The quality of software is the quality of the thoughts put into it by everyone who contributed to it. If the thinking diminishes over time due to turnover, the quality will follow suit. If the original authors lack the abilities or understanding, the quality will follow suit.
So we can effectively mark out zero quality as being any set of random permutations that maximizes the incorrect behaviors, or bugs, as we like to call them.
But we can also go the other way and say that a very small set of permutations that makes reasonable behavioral tradeoffs while converging very close to zero deficiencies (both in the code itself and in its behavior) is the highest achievable quality. You can only achieve high quality if you’ve taken the time to really understand each and every aspect of what behavior is necessary. The understanding of the authors would have to be nearly full and complete, with no blind spots. That is a huge amount of knowledge, which takes a long time to acquire, and needs a group of people to hold and apply, which is why we don’t see software at that high quality level very often.
We value artwork correctly, though. The value of a gifted artist’s work is not the value of the canvas, the frame, and the pigments applied. It is all that went into the artist’s life that drove them to express their feelings in a particular painting. The Mona Lisa is a small canvas, but it has great value, well beyond its physical presence.
Code is the same way. A talented and super knowledgeable group of people can come together to craft something deep and extremely useful. Its usefulness and value go far beyond the code itself; they come from the thoughts that were built up in order to bring it into existence.
When that is forgotten, people stop trying to think deeply, and the quality plummets as a direct result. Thought is valuable, code is just proof that it happened.
Thursday, September 25, 2025
Nexus Point
There have been these moments, over the last four decades, where the winds of change were wafting throughout the software industry.
They are not big, dramatic events, but just these little ripples that are the harbinger of change.
One day, my university roommate showed me this plugin for Emacs that he had gotten from somewhere. It displayed “hypertext” from some server in Europe. Pretty browsers came later, and then the web descended on us like a tsunami, dragging computers out of basements for all to see.
Some guy named Steve finally showed off all of the tech he liked crammed into one tiny package; small enough to be convenient, but useful enough to be addictive.
I bought this interesting textbook about patterns, then watched as it morphed the cool new programming language into being the next legacy code generator.
I read this newly published manifesto about how not to get lost in bureaucracy, only to watch it spawn off its own aggressive and bizarre cult, making the bureaucracy the good old days.
Movements in software follow the snowball trend. There are slight indications that something is up, but then, as it rolls downhill, it picks up speed and size, until it comes slamming down onto the unsuspecting people below.
Seems like we are back at one of those moments. The winds of change are wafting again.
It’s not AI itself, though; that’s just a cute mechanical trick, occasionally impressive, but far too erratic to be reliable. The real change is in its wake.
Silver bullets come and go regularly, but this one seems to be finally forcing us to be more honest about how we build code. The myths and games that clouded the past are quickly getting dispersed.
We’ve spent decades running away from the truth: programming is not a magic art form. It is something people can do; we can train them to do it, and they could be doing a much better job at coding someday than they are doing right now. Maybe LLMs are the nail guns of our industry, or maybe they are just a passing fad, but either way, they highlight the parts of programming that are not creative.
Most of what we do is grinding out pretty basic code, and once the larger directions are established, it is just work. Code is at its best when it is boring, predictable, readable, and doesn’t wantonly waste resources. Clear and organized. Pedantic with lots of tedious attention paid to the details.
There is and will always be some code that is super special, and we know that leveraging a powerful abstraction will lift the game to the next level, but all of the other stuff that surrounds that small amount of code is just not that interesting. Boring enough that a clever mechanical process can emulate us grinding it out.
What we do with this understanding is the big question. Will we get introspective and figure out how to make most coding a more reliable, trustworthy, and deterministic pursuit, or will we continue to hide in delusions of grandeur? Will we creatively ignore most of the details, or will we use this new knowledge to refine how we approach the work? There is a chance here that we can lift software development up to a new level, which is important given how reliant we have become on the stuff.
Thursday, September 18, 2025
Codebase Organization
A messy working environment is a huge amount of unnecessary friction. The worse it is, the harder it becomes to do things. It slows everything down and degrades the quality of the output. Digital work environments are no different than physical ones.
Like any other profession, software developers need to keep their workspaces tidy. Their primary output is the codebases they are building. So their workspaces are usually the code, its artifacts, the builds on their workstations, the backups to source code control repositories, and the deployments to test environments. All this is necessary before being able to release software to any operational environments.
Organization is three things: a) a place for everything, b) everything in its place, and c) not too many similar things all in the same place.
We primarily work on code and configuration data. We’ll generally use at least two different programming languages, one as primary, the other for builds and automation. There are often secondary, indirect, but related resources for complex issues like persistence handling.
A place for everything means that if you have some new code or data, you know exactly where it should go. It’s not open for discussion; there aren't any choices. It exists, everyone knows about it, and there is just one place for it to go.
Organization is zero freedom. If you don’t put things in their place, it is disorganized. If you do that enough, it is a mess, and it becomes increasingly harder to find or deal with the stuff you already have. Creating new stuff in a huge, disorganized mess just makes the mess worse; it does not fix the problem.
That place is dictated by the architecture, which lays down a structure for all of the code and included artifacts. It is specific: code X belongs in file Y, in directory Z. If there is doubt or ambiguity as to the place, it is disorganized.
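As a purely hypothetical sketch, an architecture might dictate the places like this:

src/domain/        -- business rules only, no I/O
src/persistence/   -- everything that touches the database
src/api/           -- request handling and validation
scripts/           -- build and automation code
config/            -- configuration data, nothing executable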
More importantly, if you have duplicate versions of code X, and they are located in two different places, this is disorganized. One of them is in the wrong place. Duplicate code is a direct form of disorganization.
Naming is considered hard, but that is because many programmers believe there is a lot of freedom available. However, the name itself is also an artifact, and so it has a specific place too. That place is dictated by the naming conventions. To be organized, you need the naming conventions, and they should be explicit about where you ‘place’ the names. This includes not only any external references, but also variables, functions, etc. Every name in all of the code and the artifacts. Comments are a little outside of this, as they should be optional, extra knowledge that is not obvious from the code, the artifacts, the architecture, or the naming convention.
Again, if you are organized, you’d never end up with the same ‘thing’ called two different names, as one of those names is wrong. Good naming isn’t just an attribute of readability; it is also a big part of staying organized. Bad, inconsistent naming is a visible form of disorganization.
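A tiny, hypothetical illustration of the same ‘thing’ wearing two different names:

# Disorganized: one concept, a customer identifier, under two names.
def load_customer(cust_id):
    ...

def send_invoice(customer_number):
    ...

# Organized: the naming convention dictates one name, used everywhere.
def load_customer(customer_id):
    ...

def send_invoice(customer_id):
    ...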
This is wonderful and all, but in actual practice, strict organization takes a huge amount of time, and we are generally rushed while working. So, things will get messy -- disorganized -- but it’s very, very important to stop, every so often, and do some cleanup. You can’t let the mess win, and you get more time back from cleaning up messy stuff than you lose doing it.
Cleanup is just refactoring. Moving things around to put them back into the places where they should have been originally. For some stuff, that might first mean deciding on the ‘place’ and then sticking to it consistently for all things in the codebase. It is essentially non-destructive (unless there are architectural or domain problems that get dragged in) and really is just moving things a little closer to being better organized.
Decide on a place for ‘things’ that you ignored earlier. Find that stuff and put it into its place. If one place ends up with too many slightly different things, break it up into two or more places.
If you keep doing this regularly, the code will converge on being well-written. If you don’t do it or are not allowed to do it, the mess will continue to grow, and grow, and grow until the friction becomes a hurricane.
Wednesday, September 10, 2025
Manifestations
The only two things in a computer are code and data.
Code is a list of instructions for a computer to follow. Data is a symbolic encoding of bits that represents something else.
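A tiny illustration of that second point: the same bits only ‘mean’ something relative to an encoding that someone chose for them.

import struct

raw = b'\x41\x42\x43\x44'            # the same 32 bits...
print(raw.decode('ascii'))           # ...read as text: ABCD
print(struct.unpack('>I', raw)[0])   # ...read as a big-endian integer: 1094861636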
In the simplest of terms, code is a manifestation of what a programmer knew when they wrote it. It’s a slight over-simplification, but not too far off.
More precisely, some code and some configuration data come directly from a programmer’s understanding.
There could be generated code as well. But in an oddball sense, the code that generated that code was the manifestation, so it is still there.
Any data in the system that has not been ‘collected’ is configuration data. It was understood and placed there by someone.
These days, most code comes from underlying dependencies. Libraries, frameworks, other systems, and products. Interactions with these are glued into the code. The glue code is the author’s understanding, and the dependency code is the understanding of all of the other authors who worked on it.
Wherever and however we boil it down, it comes down to something that some person understood at some point. Code does not spontaneously generate. At least not yet.
The organization and quality of the code come directly from its author. If they are disorganized, the code is disorganized. If they are confused, the code is confused. If they were rushed, the code is weak. The code is what they understand and are able to assemble as instructions for the computer to follow.
Computers are essentially deterministic machines, but the output of code is not guaranteed to be deterministic. There are plenty of direct and indirect ways of injecting non-determinism into code. Determinism is a highly valuable property; you really want it in code, where possible, because it is the anchor property for nearly all users' expectations. If the author does not understand how to do this, the code will not be deterministic, and it is far too easy to make mistakes.
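A small, hypothetical example of how easily non-determinism slips in, and how little it takes to pin it back down:

import time

# Non-deterministic: depends on the wall clock and on set iteration order,
# which can vary between runs because of hash randomization.
def build_report(events):
    return {"generated": time.time(), "kinds": list({e["kind"] for e in events})}

# Deterministic: the clock is passed in and the ordering is pinned down.
def build_report_fixed(events, now):
    return {"generated": now, "kinds": sorted({e["kind"] for e in events})}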
That code is so closely tied to the understandings of its authors that it has a lot of ramifications. The most obvious is that if you do not know something, you cannot write code to accomplish it. You can’t because you do not know what that code should be.
You can use code from someone else who knows, but if there are gaps in their knowledge or it doesn’t quite apply to your situation, you cannot really fix it. You don’t know how to fix it. You can patch over the bad circumstances that you’ve found, but if they are just a drop in a very large bucket, they will keep flowing.
As a consequence, the combined output from a large group of novice programmers will not exceed their individual abilities. It doesn’t matter how many participate; it is capped by understanding. They might be able to glue a bunch of stuff together, as learning how to glue things is a lesser skill than coding them, but all of the risks associated with those dependencies are still there and magnified by the lack of knowledge.
As mentioned earlier, a code generator is just a second level of indirection for the coding issues. It still traces back to people. Any code constructed by any automated process has the same problem, even if that process is sophisticated. Training an LLM to be a dynamic, but still automated, process does not escape this limitation. The knowledge that flowed into the code just comes from more sources, is highly non-deterministic, and rather obviously carries even more risk. It’s the same as adding more novice programmers into the mix; it just amplifies the problems. Supposedly, enough monkeys randomly typing on typewriters could eventually generate Shakespeare, but that says nothing about the billions of monkeys you’ll need to do it, nor the effort to find that elusive needle in a rather massive haystack. It’s a tree falling in a forest with no one around.
For decades, there have been endless silver bullets launched in an attempt to separate code and configuration data from the people who need to understand it. As Frederick P. Brooks pointed out back in the 1980s, it is not possible. Someone has to issue the instructions, and they cannot do that if they don’t understand them. The work in building software is acquiring that understanding; the code is just the manifestation of that effort. If you don’t do the work, you will not get the software. If you get rid of the people who did the work, you will not be able to continue the work.