Thursday, December 12, 2024

In Full

One of the key goals of writing good software is to minimize the number of lines of code.

This is important because if you only needed 30K lines, but ended up writing 300K, it was a lot of work that is essentially wasted. It’s 10x the amount of code, 10x the amount of testing, and 10x the bugs. As well, you have to search through 10x lines to understand or find issues.

So, if you can get away with 30k, instead of 300K, then is it always worth doing that?

Well, almost always.

30K of really cryptic code that uses every syntactic trick in the book to cleverly compress the code down to almost nothing is generally unreadable. It is too small now.

You always have to revisit the code all of the time.

First to make sure it is reusable, but then later to make sure it is doing the right things in the right way. Reading code is an important part of keeping the entire software development project manageable.

So 30K of really compressed, unreadable stuff, in that sense is no better than 300K of really long and redundant stuff. It’s just a similar problem but in the opposite direction.

Thus, while we want to shrink down the representation of the code to its minimum, we oddly want to expand out the expression of that code to its maximum. It may seem like a contradiction, but it isn’t.

What the code does is not the same as the way you express it. You want it to do the minimum, but you usually want the expression to be maximized.

You could for instance hack 10 functions into a complex calling chain something like A(B(C(D(E(F(G(H(I(K(args))))))) or A().B().C().D().E().F().G().H().I().K(args) so it fits on a single line.

That’s nuts, and it would severely impair the ability of anyone to figure out what either line of code does. So, instead, you put the A..K function calls on individual lines. Call them one by one. Ultimately it does the same thing, but it eats up 10x lines to get there. Which is not just okay, it is actually what you want to do.

It’s an unrealistic example since you really shouldn’t end up with 10 individual function calls. Normally some of them should be encapsulated below the others. 

Encapsulation hides stuff, but if the hidden parts are really just sub-parts of the parent, then it is a very good form of hiding. If you understand what the parent needs to do, then you would assume that the children should be there. 

Also, the function names should be self-describing, not just letters of the Alphabet. 

Still, you often see coding nearly as bad as the above, when it should have been written out fully.

Getting back to the point, if you had 60K of code that spelled out everything in full, instead of the 30K of cryptic code, or the 300K of redundant code, then you have probably reached the best code possible. Not too big, but also not too small. Lots of languages provide plenty of synaptic sugar, but you really only want to use those features to make the code more readable, not just smaller.

Friday, December 6, 2024

The Right Thing

Long ago, rather jokingly, someone gave me specific rules for working, the most interesting of them was “Do the Right thing”.

In some cases with software development, the right thing is tricky. Is it the right thing for you? For the project? For using a specific technology?

In other cases though, it is fairly obvious.

For example, even if a given programming language lets you get crazy with spacing, you do not. You format the source code properly, each and every time. The formatting of any source file should be clean.

In one sense, it doesn’t bug the computer if the spaces are messed up. But source code isn’t just for a computer. You’ll probably have to reread it a lot, and if the code is worth using, aka well written, lots of other people will have to read it as well. We are not concerned with the time it takes to type in the code or edit it carefully to fix any spacing issues. We are concerned with the friction that bad spacing adds when humans are reading the code. The right thing to do is not add unnecessary friction.

That ‘do extra now, for a benefit later’ principle is quite common when building software software. Yet we see it ignored far too often.

One place it applies strongly, but is not often discussed is with hacks.

The right thing to do if you encounter an ugly hack below you is not to ignore it. If you let it percolate upwards, the problem with it will continue. And continue.

Instead, if you find something like that, you want to encapsulate it in the lowest possible place to keep it from littering your code. Don’t embrace the ugliness, don’t embrace the mess. Wrap it and contain it. Make the wrapper act the way the underlying code should have been constructed.

The reason this is right is because of complexity. If everybody lets everybody else’s messes propagate everywhere, the artificial complex in the final effort will be outrageous. You haven’t built code, you just glued together a bunch of messes into a bigger mess. The value of what you wrote is scarce.

A lot of people like the ‘not my problem’ approach. Or the ‘do the laziest thing possible’ one. But the value you are building is essentially crippled with those approaches. You might get to the end quicker, but you didn’t really do what was asked. If they asked me to build a system, it is inherent that the system should be at least good enough. If what I deliver is a huge mess, then I failed on that basic implicit requirement. I did not do my job properly.

So the right thing is to fix the foundational messes by properly encapsulating them so that what is built on top is 1000x more workable. Encapsulate the mess, don’t percolate it.

If you go back and look at much older technology, you definitely see periods where doing the right thing was far more common. And not only does it show, but we also depend on that type of code far more than we depend on the flakey stuff. That’s why the length of time your code is in production is often a reasonable proxy to its quality, which is usually a manifestation of the ability of some programmers to find really clean ways of implementing complicated problems. It is all related.