Thursday, February 19, 2026

Data Collection

One of the strongest abilities of any software is data collection. Computers are stupid, but they can remember things that are useful.

It’s not enough to just have some widgets display it on a screen. To collect data means that it has to be persisted for the long term. The data survives the program being run and rerun, over and over again.

But it’s more than that. If you collect data that you don’t need, it is a waste of resources. If you don’t collect the data that you need, it is a bug. If you keep multiple copies of the same data, it is a mistake. The software is most useful when it always collects just what it needs to collect.

And it matters how you represent that data. Each individual piece of data needs to be properly decomposed. That is, if it is two different pieces of information, it needs to be collected into two separate data fields. You don’t want to over-decompose, and you don’t want to clump a bunch of things together.

Decomposition is key because it allows the data to be properly typed. You don’t want to collect an integer as a string; it could be misinterpreted. You don’t want a bunch of fields clumped together as unstructured text. Data in the wrong format opens the door for it to be misinterpreted as information, causing bugs. You don’t want mystery data; each datum should have a self-describing label that is unambiguous. If you collect data that you cannot interpret correctly, then you have not collected that information.
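As a rough sketch of what decomposition and typing could look like, here is a small Python example. The Reading record, its field names, and the comma-separated input format are all made up for illustration; the point is only that each piece of information gets its own labelled, typed field instead of living inside one clumped string.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Reading:
    """One decomposed, typed record; every field has an unambiguous label."""
    sensor_id: str
    taken_on: date
    temperature_c: float
    sample_count: int


def parse_reading(raw: str) -> Reading:
    """Split a clumped line such as 'A17,2026-02-19,21.5,4' into typed fields."""
    sensor_id, taken_on, temperature_c, sample_count = raw.split(",")
    return Reading(
        sensor_id=sensor_id.strip(),
        taken_on=date.fromisoformat(taken_on.strip()),
        temperature_c=float(temperature_c),
        sample_count=int(sample_count),
    )
```

Once the count is an integer and the date is a date, the code on top can do arithmetic and comparisons without guessing what the text was supposed to mean.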

If you have the data format correct, then you can discard invalid junk as you are collecting it. Filling a database with junk is collecting data you don’t need, and if you did that instead of getting the data you did need, it is also a bug.
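A minimal sketch of discarding junk at collection time, assuming a hypothetical field that must be a non-negative integer. The function name and the rule itself are illustrative only; a real system would have its own rules for each field.

```python
def collect_quantities(raw_values):
    """Keep only values that parse as non-negative integers; set the rest aside."""
    accepted, rejected = [], []
    for raw in raw_values:
        try:
            quantity = int(raw)
        except (TypeError, ValueError):
            # Not an integer at all, so it never reaches persistence.
            rejected.append(raw)
            continue
        if quantity < 0:
            rejected.append(raw)
        else:
            accepted.append(quantity)
    return accepted, rejected
```

Only the accepted list would be persisted. The rejected list is kept here just so that the junk can be reported back rather than silently dropped.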

A datum is never independent. You need to collect data, and that data has a structure that binds together all of the underlying pieces correctly. If you downgrade that structure, you have lost the information about it. If you put the data into a broader structure, you have opened up the possibility of it getting filled with junk data. For example, if the relationship between the data is a hierarchical tree, then the data needs to be collected as a tree; neither a list nor a graph is a valid collection.
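To make the tree example a little more concrete, here is one possible sketch in Python. It assumes the hierarchy is collected as a mapping from each node to its single parent, with None marking the root; that representation and the build_tree name are assumptions for illustration, not the only way to do it.

```python
def build_tree(parent_of):
    """Validate child -> parent links and return (root, children map).

    A dict gives each node at most one parent, so a node with two parents
    (a general graph) cannot even be represented here.
    """
    roots = [node for node, parent in parent_of.items() if parent is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")

    children = {node: [] for node in parent_of}
    for node, parent in parent_of.items():
        if parent is None:
            continue
        if parent not in parent_of:
            raise ValueError(f"{node!r} points at unknown parent {parent!r}")
        children[parent].append(node)

    # Walk down from the root; any node never reached sits on a cycle.
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children[node])
    if len(seen) != len(parent_of):
        raise ValueError("cycle detected: some nodes are unreachable from the root")
    return roots[0], children
```

A flat list would lose the parent links entirely, and a general edge list would allow extra parents and cycles. Here the shape of the structure itself rules most of that out, and the remaining checks are short.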

In most software, most of the data is intertwined with other values. If you started with one specific piece of data, you should be able to quickly navigate to any of the others. That means that you have collected all of the structures and interconnections properly, and you have not lost any of them. There should only be one way to navigate, or you have collected redundant connections.

As such, if you have collected all of the data you need, then you can validate it. There won’t be data that is missing, there won’t be data that is junk. You can write simple validations that will ensure that the software is working properly, as expected. If the validations are difficult, then there is a problem with the data collection.
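As an illustration of how simple those validations can be when the collection is right, here is a hedged sketch. The customers and orders structures, the field names, and the two rules are all hypothetical.

```python
def validate_orders(customers, orders):
    """Return a list of problems; an empty list means the data is consistent."""
    problems = []
    for order_id, order in orders.items():
        if order["customer_id"] not in customers:
            problems.append(
                f"order {order_id}: unknown customer {order['customer_id']}"
            )
        if order["quantity"] <= 0:
            problems.append(f"order {order_id}: quantity must be positive")
    return problems
```

Each check is a line or two because the fields are already typed and the connections between records are explicit. If a check like this needed pages of parsing and guessing first, that would point back to a problem in the collection.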

If you collect all of the data you need for the software correctly, then writing the code on top of it is way simpler and far easier to properly structure. The core software gets the data from persistence, then passes it out to some form of display. It may come back with some edits, which need to be updated in the persistence. There may be some data that you did not collect, but the data you did collect is enough to be able to derive it from a computation. There may be tricky technical issues that are necessary to support scaling, but those are independent from the collection and flow of data.

Collecting data is the foundation of almost all software. If you get it right, you will be able to grow the software to gradually cover larger parts of the problem domain. If you make a mess out of it, the code will get really ugly, and the software will be unreliable.

Thursday, February 12, 2026

Blockers

Some days the coding goes really smoothly. You know what you need; you lay out a draft version, which happens nicely. It kinda works. You pass over it a bunch of times to bang it properly into position. A quick last pass to enhance its readability for later, and then out the door it goes.

Sometimes, there is ‘friction’. You start coding, but you have to keep waiting on other things. So, it’s code a bit, set it aside, code a bit, etc. The delays can be small, but they add up and interfere with the concentration and sense of accomplishment.

Some friction comes from missing analysis. There was something you should have known, but it fell through the cracks. Some comes from interactions with others. You need something from your teammates, or you need it from some other external group.

With some issues involving external groups, it takes lots of time to escalate them, arrange introductory meetings, get to the issue, and then finally come to a resolution. You can kinda fake the code a little in the meantime, but that is usually throw-away work, so you’d prefer to minimize it. If you are patient, it will eventually get done.

Occasionally, though, there is a ‘blocker’. It is impassable. You started to work on something, but it was shut down. You are no longer able to work on it. It’s a dead end.

One type of blocker is that someone else is doing the same work. You were going to write something, but it turns out they got there first or have some type of priority. In some cases, that is fine, but sometimes you feel that you could have done a much better job at the effort, which is frustrating. Their code is limiting.

Another type is knowledge-based. You need something, but it is far too complex or time-consuming for others to let you write it.

Some code is straightforward. But some code requires buckets of very specific knowledge first, or the code will become a time sink. People might stop you from writing systems programming components like persistence, or domain-specific languages, or synchronization, for example. Often, that morphs into a buy-versus-build decision. So something similar exists; you feel you could do it yourself, but they purchase it instead, and the effort to integrate it is ugly. If you don’t already have that knowledge, you dodged a bullet, but if you do have it, it can be very frustrating to watch a lesser component get added into the mix when it could have been avoided with just a bit of time.

There are fear-based blockers as well. People get worried that doing something a particular way may just be another time sink, so they stop it quickly. That is often the justification for brute force style coding, for example. They’d rather run hard and pound it all out as a mess than step back and work through it in a smart way. In some shops, the only allowable code is glue, since they are terrified of turnover.

In that sense, blockers are usually about code. You have it, you need it, where is it going to come from? Are you allowed to write something or not? With knowledge, you can usually do the work to figure it out, or at least approximate it. There could be some secret knowledge that you really need in order to move forward but are fully blocked from getting, although that is extremely rare.

If you flip that around, when you're building a medium-sized or larger system, the big issue is where is the code for it going to come from? In that sense, building software is the work of getting all of the code you need together in one organized place. Some of it exists already, some of it you have to create yourself.

In the past, the biggest concern about pre-existing code was always ‘support’. You don’t want to build on some complex component only to have it crumble on you, and there is nothing you can do about it. That is an expensive mistake. So, if you aren’t going to write it yourself, then who is going to support it, and how good is that support?

If you follow that, then you generally come to understand that as you build up all of this code, support is crucial. It’s not optional, and it is foolish to assume the code is bug-free and will always work as expected.

It’s why old programmers like to pound out a lot of stuff themselves; they know when doing that, they can support their own code, and they know that that doesn’t waver until they leave the project. The support issue is resolved.

It’s also why most wise programmers don’t just add in any old library. They’ve had issues with little dodgy libraries that were poorly supported in the past, so they have learned to avoid them. Big, necessary components are unavoidable, but the little odd ones are not. If you can’t find a legitimate version of something, doing it yourself is a much better choice.

Which brings us all of the way around to vibe coding. If you’ve been around a while, then nothing seems like a worse idea than having the ability to dynamically generate unsupported code. Tonnes of it.

Particularly if it is complex and somewhat unlimited in depth.

A whack load of boilerplate might be okay; at least you can read and modify it, although a debugger would still likely be necessary to highlight the problem, so it can mean a lot of work recreating the issue. So, it might only be a short-term time saver, but a nasty landmine waiting for later. Supportable, but costly.

But it would be heartbreaking to generate 100K in code, which is almost usable but entirely unsupportable. If you did it in a week, you’d probably just have to live with the flaws or spend years trying to pound out the bugs.

Not surprisingly, people tried this often in the past. They built sophisticated generators, hit the button and got full, ready-to-go applications. You don’t see any of these around anymore, since the support black holes they formed consumed them and everything else around them, so they essentially eradicated the evidence of their existence. It was tried, and it failed miserably.

But even more interesting was that those older application generators were at least deterministic. You could run them ten times, and mostly get back the same code. With vibe coding, each run is a random turkey shoot. You’ll get something different. So, extra unsupportable, and extra crazy.

If you are going to build a big system to solve a complex problem, then you need to avoid any and all blockers that get in your way. Friction can slow you down, but a blocker is often fatal.

These days, you’re not really ‘writing’ the system, so much as you are ‘assembling it’. If you do that from too many unsupportable subparts, then the whole will obviously be unsupportable. Inevitably, if you put something into a production environment, you either have to be prepared to support it somehow or move on to the next gig. But if too much unsupportable crud gets out there, that next gig may be even worse than the one that you tried to flee from.

Thursday, February 5, 2026

Systems Thinking

There are two main schools of thought in software development about how to build really big, complicated stuff.

The most prevalent one, these days, is that you gradually evolve the complexity over time. You start small and keep adding to it.

The other school is that you lay out a huge specification that would fully work through all of the complexity in advance, then build it.

In a sense, it is the difference between the way an entrepreneur might approach doing a startup versus how we build modern skyscrapers. Evolution versus Engineering.

I was working in a large company a while ago, and I stumbled on the fact that they had well over 3000 active systems that were covering dozens of lines of business and all of the internal departments. It had evolved this way over fifty years, and included lots of different tech stacks, as well as countless vendors. Viewed as ‘one’ thing it was a pretty shaky house of cards.

It’s not hard to see that if they had a few really big systems, then a great number of their problems would disappear. The inconsistencies between data, security, operations, quality, and access were huge across all of those disconnected projects. Some systems were up-to-date, some were ancient. Some worked well, some were barely functional. With way fewer systems, a lot of these self-inflicted problems would just go away.

It’s not that you could cut the combined complexity in half, but more likely that you could bring it down to at least one-tenth of what it is today, if not even better. It would function better, be more reliable, and would be far more resilient to change. It would likely cost far less and require fewer employees as well. All sorts of ugly problems that they have now would just not exist.

The core difference between the different schools really centers around how to deal with dependencies.

If you had thousands of little blobs of complexity that were all entirely independent, then getting finished is just a matter of banging out each one by itself until they are all completed. That’s the dream.

But in practice, very few things in a big ecosystem are actually independent. That’s the problem.

If you are going to evolve a system, then you ignore these dependencies. Sort them out afterwards, as the complexity grows. It’s faster, and you can get started right away.

If you were going to design a big system, then these dependencies dictate that design. You have to go through each one and understand them all right away. They change everything from the architecture all the way down to the idioms and style in the code.

But that means that all of the people working to build up this big system have to interact with each other. Coordinate and communicate. That is a lot of friction that management and the programmers don’t want. They tend to feel like it would all get done faster if they could just go off on their own. And it will, in the short-term.

If you ignore a dependency and try to fix it later, it will be more expensive. More time, more effort, more thinking. And it will require the same level of coordination that you tried to avoid initially. Slightly worse, in that the time pressures of doing it correctly generally give way to just getting it done quickly, which pumps up the overall artificial complexity. The more hacks you throw at it, the more hacks you will need to hold it together. It spirals out of control. You lose big in the long-term.

One of the big speed bumps preventing big up-front designs is a general lack of knowledge. Since the foundations like tech stacks, frameworks, and libraries are always changing rapidly these days, there are few accepted best practices, and most issues are incorrectly believed to be subjective. They’re not, of course, but it takes a lot of repeated experience to see that.

The career path of most application programmers is fairly short. In most enterprises, the majority have five years or less of real in-depth experience, and battle-scarred twenty-year+ vets are rare. Mostly, these novices are struggling through early career experiences, not ready yet to deal with the unbounded, massive complexity present in a big design.

Also, the other side of it is that evolutionary projects are just more fun. I’ve preferred them. You’re not loaded down with all those messy dependencies. Way fewer meetings, so you can just get into the work and see how it goes. Endlessly arguing about fiddly details in a giant spec is draining, made worse if the experience around you is weak.

Evolutionary projects go very badly sometimes. The larger they grow, the more likely they will derail. And the fun gives way to really bad stress. That severe last-minute panic that comes from knowing that the code doesn't really work as it should, and probably never will. And the longer-term dissatisfaction of having done all that work to ultimately just contribute to the problem, not actually fix it.

Big up-front designs are often better from a stress perspective. A little slow to start and sometimes slow in the middle, they mostly smooth out the overall development process. You’ve got a lot of work to do, but you’ve also got enough time to do it correctly. So you grind through it, piece by piece, being as attentive to the details as possible. Along the way, you actively look for smarter approaches to compress the work. Reuse, for instance, can shave a ton of code off the table, cut down on testing, and provide stronger certainty that the code will do the right thing in production.

The fear that big projects will end up producing the wrong thing is often overstated. It’s true for a startup, but entirely untrue for some large business application for a market that’s been around forever. You don’t need to burn a lot of extra time, breaking the work up into tiny fragments, unless you really don’t have a clue what you are building. If you're replacing some other existing system, not only do you have a clue, you usually have a really solid long-term roadmap. Replace the original work and fix its deficiencies.

There should be some balanced path in the middle somewhere, but I haven’t stumbled across a formal version of it after all these decades.

We could go first to the dependencies, then come up with reasons why they can be temporarily ignored. You can evolve the next release, but still have a vague big design as a long-term plan. You can refactor the design as you come across new, unexpected dependencies. Change your mind, over and over again, to try to get the evolved works to converge on a solid grand design. Start fast, slow right down, speed up, slow down again, and so forth. The goal is one big giant system to rule them all, but it may just take a while to get there.

The other point is that the size of the iterations matters, a whole lot. If they are tiny, it is because you are blindly stumbling forward. If you are not blindly stumbling forward, they should be longer, as it is more effective. They don’t have to all be the same size. And you really should stop and take stock after each iteration. The faster people code, the more cleanup that is required. The longer you avoid cleaning it up, the worse it gets, on basically an exponential scale. If you run forward like crazy and never stop, the working environment will be such a swamp that it will all grind to an abrupt stop. This is true in building anything, or even cooking in a restaurant. Speed is a tradeoff.

Evolution is the way to avoid getting bogged down in engineering, but engineering is the way to ensure that the thing you build really does what it is supposed to do. Engineering is slow, but spinning way out of control is a heck of a lot slower. Evolution is obviously more dynamic, but it is also more chaotic, and you have to continually accept that you’ve gone down a bad path and need to backtrack. That is hard to admit sometimes. For most systems, there are parts that really need to be engineered, and parts that can just be allowed to evolve. The more random the evolutionary path, the more stuff you need to throw away and redo. Wobbling is always expensive. Nature gets away with this by having millions of species, but we really only have one development project, so it isn’t particularly convenient.

Thursday, January 29, 2026

Reap What You Sow

When I first started programming, some thirty-five years ago, it was a somewhat quiet, if not shy, profession. It had already been around for a while, but wasn’t visible to the public. Most people had never even seen a serious computer, just the overly expensive toys sold at department stores.

Back then, to get to an intermediate position took about 5 yrs. That would enable someone to build components on their own. They’d get to senior around 10 yrs, where they might be expected to create a medium-sized system by themselves, from scratch. 20 yrs would open up lead developer positions for large or huge projects, but only if they had the actual experience to back it up. Even then, a single programmer might spend years to get their code size up to medium, so building larger systems required a team.

Not only did the dot-com era rip programming from the shadows, but the job positions also exploded. Suddenly, everyone used computers, everyone needed programmers, and they needed lots of them. That disrupted the old order, but never really replaced it with anything sane.

So, we’d see odd things like someone getting promoted to a senior position after just 2 or 3 years of working; small teams of newbie programmers getting tasked with building big, complex systems with zero guidance; various people with no significant coding experience hypothesising about how to properly build stuff. It’s been a real mess.

For most computer systems, to build them from scratch takes an unreasonably large amount of knowledge and skill, not only about the technical aspects, but also the domain problems and the operational setups, too.

The fastest and best way to gain that knowledge is from mentoring. Courses are good for the basics, but the practice of keeping a big system moving forward is often quite non-intuitive, and once you mix in politics and bureaucracy, it is downright crazy.

If you spend years in the trenches with people who really get what they are doing, you come up to speed a whole lot faster.

We have a lot of stuff documented in books and articles, and some loose notions of best practices, but that knowledge is often polluted with trendy advice, so people bend it incorrectly to help them monetize stuff.

There has always been a big difference between what people say should be done and what they actually do successfully.

Not surprisingly, after a couple of decades of struggling to put stuff together, the process knowledge is nearly muscle memory and somewhat ugly. It’s hard to communicate, but you’ve learned what has really worked versus what just sounds good. That knowledge is passed on by word of mouth, and it’s that knowledge that you really want to learn from a mentor. It ain’t pretty, but you need it.

As a consequence, it is no surprise that the strongest software development shops have always had a good mix of experience. It is important. Kids and hardened vets, with lots of people in the middle. It builds up a good environment and follows loosely from the notion that ‘birds of a feather flock together’. That type of experience diversity is critical, and when it comes together, the shop can smoothly build any type of software it needs to build. Talent attracts talent.

That’s why when we see it crack, and there is a staffing landslide, where a bunch of experienced devs all leave at the same time, it often takes years or decades to recover. Without a strong culture of learning and engineering, it’s hard to attract and keep good people; it’s understaffed, and the turnover is crazy high.

There are always more programming jobs than qualified programmers; it seems that never really changes.

Given that has been an ongoing problem in the industry for half a century, we can see how AI may make it far worse. If companies stop hiring juniors because their intermediates are using AI to whack out that junior-level code, that handoff of knowledge will die. As the older generations leave without passing on any process knowledge, it will eventually be the same as only hiring a bunch of kids with no guidance. AI won’t help prevent that, and its output will be degraded from training on those fast-declining standards.

We’ve seen that before. One of the effects of the dot-com era was that the lifespan of code shrank noticeably. The older code was meant to run for decades; the new stuff is often just replaced within a few years after it was written. That’s part of why we suddenly needed more programmers, but also why the cost of programming got worse. It was offset somewhat by having more libraries and frameworks available, but because they kept changing so fast, they also helped shorten the lifespan. Coding went from being slowly engineered to far more of a performance art. The costs went way up; the quality declined.

If we were sane, we’d actually see the industry go the other way.

If we assume that AI is here to stay for coders, then the most rational thing to do would be to hire way more juniors, and let them spend lots of time experimenting and building up good ways to utilize these new AI tools, while also getting a chance to learn and integrate that messy process knowledge from the other generations. So instead of junior positions shrinking, we’d see an explosion of new junior positions. And we’d see vets get even more expensive.

That we are not seeing this indicates either a myopic management effect or that AI itself really isn’t that useful right now. What seems to be happening is that management is cutting back on payroll long before the intermediates have successfully discovered how to reliably leverage this new toolset. They are jumping the gun, so to speak, and wiping out their own dev shops as an accidental consequence. It will be a while before they notice.

This has happened before; software development often has a serious case of amnesia and tends to forget its own checkered history. If it follows the older patterns, there will be a few years of decreasing jobs and lower salaries, followed by an explosion of jobs and huge salary increases. They’ll be desperate to undo the damage.

People incorrectly tend to think of software development as one-off projects instead of continually running IT shops. They’ll do all sorts of short-term damage to squeeze value, while losing big overall.

Having lived through the end of programming as we know it a few dozen times already, I am usually very wary of any of these hype cycles. AI will eventually find its usefulness in a few limited areas of development, but it won’t happen until it has become far more deterministic. Essentially, random tools are useless tools. Software development is never a one-off project, even if that delusion keeps persisting. If you can’t reliably move forward, then you are not moving forward. At some point, the ground just drops out below you, sending you back to square one.

The important point at the high level is that you set up and run a shop to produce and maintain the software you need to support your organization’s goals. The health of that shop is vital, since if it is broken, you can’t really keep anything working properly. When the toolset changes, it would be good if the shop can leverage it, but it is up to the people working there to figure it out, not management.

Thursday, January 22, 2026

Dirty Little Secret

At the beginning of this century, an incredibly successful software executive turned to me and said, “The dirty little secret of the software industry is that none of this stuff really works.”

I was horrified.

"Sure, some of that really ancient stuff didn’t work very well, but that is why we are going to replace it all with all of our flashy new technologies. Our new stuff will definitely fix that and work properly," I replied.

I’ve revisited this conversation in my head dozens of times over the decades. That new flashy stuff of my youth is now the ancient crusty stuff for the kids. It didn’t work either.

Well, to be fair, some parts of it worked just enough, and a lot of it stayed around hidden deep below. But it's weird and eclectic, and people complain about it often, try hard to avoid it, and still dream of replacing it.

History, it seems, did a great job of proving that the executive was correct. Each new generation piles another mess on top of all of the previous messes, with the dream of getting it better, this time.

It’s compounded by the fact that people rush through the work so much faster now.

In those days, we worked on something for ‘months’, now the expectation is ‘weeks’.

Libraries and frameworks were few and far between, but that actually afforded us a chance to gain more knowledge and be more attentive to the little details. The practice of coding software keeps degrading as the breadth of technologies explodes.

The bigger problem is that even though the hardware has advanced by orders and orders of magnitude, the effectiveness of the software has not. It was screens full of awkward widgets back then; it is still the same. Modern GUIs have more graphics, but behave worse than before. You can do more with computers these days, but it is far more cognitively demanding to get it done now. We didn’t improve the technology; we just started churning it out faster.

Another dirty little secret is that there is probably a much better way to code things, one that was more commonly known when it was first discovered in the 70s or the 80s. Most programmers prefer to learn from scratch, forcing them to re-solve the same problems people had decades ago. If we keep reinventing the same crude mechanics, it is no surprise that we haven’t advanced at all. We keep writing the same things over and over again while telling ourselves that this time it is really different.

I keep thinking back to all of those times, in so many meetings, where someone was enthusiastically expounding the virtues of some brand new, super trendy, uber cool technology, and essentially claiming “this time, we got it right”, while I knew that if I waited another five years, the tides would turn and a new generation would be claiming that that old stuff doesn’t work.

“None of this stuff really works” got stuck in my head way back then, and it keeps proving itself correct.