Sunday, September 18, 2022

Reinvention

Programmers sometimes confuse reinventing a wheel with simply rewriting it.

Let me explain…

Underneath all code is a model of operation that is often based on the interaction of sophisticated data structures. That is the core of the abstractions that drive the behavior. That is what makes the code work.

The more sophisticated the model, the more likely it is to work well (not always, but more often than not).

If you learn or read about those abstractions, then you can use that knowledge to faithfully write new implementations. That is just a rewrite. You’re not trying to invent something new, but are just leveraging all of the existing knowledge out there that you can get your hands on.

If you ignore what was done in the past, and essentially go right back to first principles, that is a reinvention. You will try to come up with your own sophisticated model that is in direct competition with what is already out there. If that existing work is already decades old, you are essentially going back in time to an earlier starting point, and you will have to find your own way from there, through all of the same problems again.

In simple cases that is not a big problem. You can craft a new GUI that looks somewhat like the older ones. They are fairly routine and usually conceptually simple.

In advanced cases though, like persistent storage, frameworks, languages, or operating systems, it is a major leap backward. So much was done, discovered, and leveraged by a massive number of people, but you are just ignoring everything. It’s unlikely that, by yourself, you have even a tiny fraction of the time needed to relearn all that was already known. What you create will just be crude.

So we really don’t want to reinvent stuff, but that is quite different from just writing our own version of existing stuff. The key difference is that long before you start a rewrite, you had better spend a great deal of effort to acquire the state of the art first. If you are just blindly coding something that seems okay, you’re doing it wrong.

Given that distinction, it gets interesting when you apply it to smaller dependencies.

If there are some simple ones that you could have written yourself, then it is way better to write them yourself. Why? You’re not reinventing the wheel, because you already have the knowledge, and it is always better to directly control any code you have to put into production. If you wrote it, you understand it better, you can fix it faster, and it is likely more future-proof as well.
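To make that concrete, here is a minimal sketch of the kind of small dependency I mean. The `chunked` helper is a hypothetical example, not taken from any particular library; it assumes all you need is to split a sequence into fixed-size pieces, which is well-understood knowledge rather than something you have to invent.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Step through the sequence in strides of `size`; the last slice
    # is simply shorter if the length is not an exact multiple.
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

A dozen lines like these are easy to read, test, and fix, which is usually not true of an external package pulled in for the same job.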

If there are complicated dependencies, with lots of mature features, then unless you obtain the precise knowledge of how they work, you are probably better off relying on them. But, as they are already complex technology, you still have to spend a lot of time to understand them, and you should leverage them to their fullest extent, instead of using subsets of a bunch of similar libraries.

In between, it likely depends on your experiences, the domain, and both the short-term and long-term goals of the organization. Having your own code is “value”, and building up a lot of value is usually super important, both for in-house and commercial projects. A ball of glue might be quick to build, but it is always an unstable nightmare and is easily replaced by the next ball of glue.
