Thursday, February 13, 2025

Control

I’ve often written about the importance of reusing code, but I fear that the notion in our industry has drifted far from what I mean.

As far as time goes, the worst thing you can do as a programmer is write very similar code, over and over and over again. We’ve always referred to that as ‘brute force’. You sit at the keyboard and pound out very specific code with slight modifications. It’s a waste of time.

We don’t want to do that because it is an extreme work multiplier. If you have a bunch of similar problems, it saves orders of magnitude of time to just write it once a little generally, then leverage it for everything else.
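A minimal sketch of the difference, using hypothetical record-formatting functions (the field names here are made up for illustration):

```python
# Brute force: near-identical code pounded out for every field,
# each copy a slight modification of the last.
def format_name(record):
    return record.get("name", "").strip().title()

def format_city(record):
    return record.get("city", "").strip().title()

# Written once, a little generally: one function covers every field,
# and each new field is a call site, not another copy to maintain.
def format_field(record, field, default=""):
    return record.get(field, default).strip().title()

record = {"name": "  ada lovelace ", "city": "london"}
print(format_field(record, "name"))  # Ada Lovelace
print(format_field(record, "city"))  # London
```

The leverage compounds: a bug fix or behavior change lands in one place instead of being re-applied to every near-duplicate.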

But somehow the modern version of that notion is that instead of writing any significant code, you just pile on as many libraries, frameworks, and products as you can. The idea is that you don’t write stuff, you just glue it together for construction speed. The corollary is that stuff written by other people is better than the stuff you’ll write.

The flaw in that approach is ‘control’. If you don’t control the code, then when there is a problem with that code, your life will become a nightmare. Your ‘dependencies’ may be buggy. Those bugs will always trigger at the moment you don’t have time to deal with them. With no control, there is little you can do about some low-level bug except find a bad patch for it. If you get enough bad patches, the whole thing is unstable, and will eventually collapse.

You get caught in a bad cycle of wasting all of your time on things you can’t do anything about, so you don’t have the time anymore to break out of the cycle. It just sucks you down and down and down.

The other problem is that the dependencies may go rogue. You picked them for a subset of what they do, but their developers might really want to do something else. They drift away from you, so your glue gets uglier and uglier. Once that starts, it never gets better.

In software, the ‘things’ you don’t control will always come back to haunt you. Which is why we want to control as much as possible.

So, reusing your own stuff is great, but reusing other people’s stuff has severe inherent risks.

The best way to deal with this is to write your own version of whatever you can, given the time available. That is, throwing in a trivial library just because it exists is bad. You can look at how they implemented it, and then do your own version which is better and fits properly into your codebase. In that sense, it's nice that these libraries exist, but it is far safer to use them as examples for learning than to wire them up into your code.

There are some underlying components, however, that are super hard to get correct. Pretty much anything that deals with persistence falls into this category, as it requires a great deal of knowledge about transactional integrity to make the mechanics fault-tolerant. If you do it wrong, you get random bugs popping up all over the place. You can’t fix a super rare bug simply because you cannot replicate it, so you’d never have any certainty that your code changes did what you needed them to do. Where there is one heisenbug, there are usually lots more lurking about.
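One small, classic illustration of why persistence is subtle: a naive save can leave a half-written file if the process dies mid-write, while the well-known write-to-temp-then-rename pattern avoids that. This is a sketch of the idea, not production-grade code, and it assumes a POSIX filesystem where `os.replace` is atomic:

```python
import json
import os
import tempfile

# Naive persistence: if the process dies partway through the write,
# the file is left truncated or corrupt -- a rare, hard-to-replicate
# failure of exactly the heisenbug variety.
def save_naive(path, data):
    with open(path, "w") as f:
        json.dump(data, f)

# Safer sketch: write to a temp file in the same directory, flush and
# fsync it, then atomically rename it over the original. Readers see
# either the old contents or the new, never a half-written mix.
def save_atomic(path, data):
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Even this sketch ignores real-world complications (fsyncing the directory entry, concurrent writers, non-POSIX filesystems), which is exactly why battle-tested persistence code is worth borrowing.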

You could learn all about low-level systems programming, fault tolerance, and such, but you probably don’t have the decade available to do that right now, so you really do want to use someone else’s code for this. You want to leverage their deep knowledge and get something nearly state-of-the-art.

But that is where things get complicated again. People seem to think that ‘newer’ is always better. Coding seems to come in waves, so sometimes the newer technologies are real actual improvements on the older stuff. The authors understood the state of the art and improved upon it. But only sometimes.

Sometimes the authors ignore what is out there, have no idea what the state of the art really is, and just go all the way back to first principles to make every old mistake again. And again. There might be some slight terminology differences that seem more modern, but the underlying work is crude and will take decades to mature if it does. You really don't want to be building on anything like that. It is unstable and everything you put on top will be unstable too. Bad technology never gets better.

So, sometimes you need to add stuff you can’t control, and that is inherently hazardous.

If you pick something trendy that is also flaky, you’ll just suffer a lot of unnecessary problems. You need to pick the last good thing, not the most recent one.

That is always a tough choice, but it is crucial to building stable stuff. As a consequence, though, it is important to recognize when the choice was bad, when you picked a dud. Admit it early, since it is usually cheaper to swap it for something else as early as possible.

Bad dependencies are time sinks. If you don’t control a dependency and can’t fix it when it breaks, then at the very least you need it to be trustworthy. That means it is reliable and relatively straightforward to use. You never need a lot of features, and in most cases, you shouldn’t need a lot of configuration either. Just stuff that does exactly what it is supposed to do, all of the time. You want it to encapsulate all of the ugliness away from you, but you also want it to deal with that ugliness correctly, not just ignore it.
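One common way to keep some control even over code you didn’t write is to hide the dependency behind a narrow interface of your own, so the rest of the codebase never touches it directly. A minimal sketch, with a hypothetical `KeyValueStore` interface and a plain dict standing in for the third-party library:

```python
# Our own narrow interface: the rest of the codebase depends on this,
# never on the third-party library directly. If the dependency goes
# rogue or gets flaky, only the one adapter behind it has to change.
class KeyValueStore:
    def get(self, key):
        raise NotImplementedError

    def put(self, key, value):
        raise NotImplementedError

# Adapter over whichever implementation we currently trust. Here an
# in-memory dict stands in for the real dependency; a later adapter
# could wrap a database client without touching any calling code.
class InMemoryStore(KeyValueStore):
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
```

The glue stays in one small place, so when the dependency drifts away from you, the swap is a local change instead of a codebase-wide rewrite.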

If you are picking great stuff to build on, then you get more time to spend building your own stuff, and if you aren’t just retyping similar code over and over again, you can spend this time keeping your work organized and digging deeply into the problems you face. You are in control. That makes coding a whole lot more enjoyable than just rushing through splatting out endless frail code. After all, programming is about problem-solving, and we want to keep solving unique high-quality problems, not redundantly trivial and annoying ones. Your codebase should build on your knowledge and understanding. That is how you master the art.
