Monday, October 5, 2020

The Good, the Bad, and the Ugly

Some people believe that the quality of code is subjective, that one person’s awful code is another person’s beautiful code.

There is a sprinkle of truth buried in that belief, but not enough to actually validate it.


The key point is that the author of any piece of code is usually biased. To them, their code is good, no matter how bad it really is. So, if we want a better assessment of ‘quality’, it has to come from the perspective of the later coders who end up working on it once the original author has left.


If you can read code, then bad code is obvious. It takes longer than normal to work through its issues and figure out what it is doing, so quality is essentially a matter of time. If the code should have taken a day to understand, but a week later you are still confused, then it is pretty safe to say that it is bad. Well, almost. Many programmers can’t ‘read’ code; they are functionally illiterate. They can write stuff, or copy and paste it from somewhere else, but it would take them a very long time to read and understand anyone else’s code, whether it was good or bad.


That’s what injects so much confusion into quantifying ‘quality’. If a programmer struggles for a week to understand some code, you can’t tell if it’s the code or the programmer, or both, that is the problem.


On top of this, there are stylistic issues, idioms, and abstractions. Encountering a new idiom in someone else’s code, for example, can really slow down the reader, particularly if they don’t recognize it as an idiom. What seems weird and unnatural to one programmer might be a very common idiom to a different group of programmers.


Even with all of these issues, we can still think in terms of an expected base time: how long a programmer with the right knowledge would need to understand the code. So we can talk about bad code as being way slower to read than that, okay code as being more or less readable, and good code as being easily extendable.


There are a huge number of different ways to make code bad. It can be obfuscated, fragmented, stupidly clever, or just obscure its intent using all sorts of tricks. Bad formatting, rampant inconsistencies, and awful naming help a lot too.


There are way fewer versions that are okay. There are still lots of different permutations, but it is far easier, and leaves far more room for creativity, to write bad code. Okay code is readable, and it isn’t onerous to make a bug fix in it.
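
To make the gap between bad and okay a bit more concrete, here is a small hypothetical sketch in Python. Both functions are invented for this post and do exactly the same thing: they total up the order amounts that are over a threshold. The names, the data, and the scenario are all assumptions, not taken from any real system.

```python
# Bad: cryptic names, a needless index loop, and a boolean-as-number
# trick all work together to hide what the function is for.
def f(a, b):
    r = 0
    for i in range(len(a)):
        r += (a[i] > b) * a[i]
    return r


# Okay: same behavior, but the names and the structure state the intent.
def total_of_large_orders(order_amounts, threshold):
    """Sum the order amounts that are strictly greater than threshold."""
    return sum(amount for amount in order_amounts if amount > threshold)
```

Neither version is wrong, but a later coder has to decode the first one before they can safely touch it, while a bug fix to the second one is a quick read and a small change.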


There are a fairly small number of variations for good code, primarily because the code has to be technically strong, but also map back to the business problems or implement a strong abstraction. If someone asks for an extension, and you find it really straightforward to make those changes, then you know that it is good.
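
As a rough sketch of what ‘straightforward to extend’ can look like, here is another hypothetical Python example. The shipping methods, the fee formulas, and every name in it are made up for illustration; the only point is that the code is organized around the terms the business already uses, so a new request maps onto one obvious place.

```python
# Hypothetical fee rules, keyed by the shipping terms the business uses.
SHIPPING_FEES = {
    "standard": lambda weight_kg: 5.00 + 0.50 * weight_kg,
    "express": lambda weight_kg: 12.00 + 0.75 * weight_kg,
}


def shipping_fee(method, weight_kg):
    """Look up the rule for the shipping method and apply it to the weight."""
    try:
        rule = SHIPPING_FEES[method]
    except KeyError:
        raise ValueError(f"unknown shipping method: {method}") from None
    return rule(weight_kg)


# The extension request "add overnight shipping" is one new entry;
# nothing else in the code has to change.
SHIPPING_FEES["overnight"] = lambda weight_kg: 25.00 + 1.00 * weight_kg
```

If the same fees were scattered across a dozen if-statements in different files, the code could still work, but the same extension would take far longer than it should.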


There is such a thing as great code, but it is exceedingly rare. Usually, it is abstract, but in a way that lets people leverage its power for all sorts of unexpected uses, and it too is pretty easy to extend (if you understand the abstraction).


It’s worth noting that just because code is believed to be working in production, that doesn’t make it good code. Most systems have at least hundreds of bugs that exist but haven’t been triggered yet, and rust is always eating away at weak constructs. Working code contributes to the system’s current stability, but it can also freeze the ability to keep developing it, and it can become unstable when the usage suddenly changes. It might just be a bomb waiting to go off at an inconvenient time.


So, we can’t really point to just one version of the code and say that it is ‘perfect’, but we can get a sense of quality, and also an understanding that as quality increases, there are considerably fewer possible variations. A small number of actual implementations are good, a larger number are okay, and the rest are just bad, though they may not cause immediate grief. Experienced, literate programmers can tell the difference, so it is far less subjective than most people realize.
