Thursday, March 30, 2023

Strong Coding Habits

It may be easy to dismiss all coding standards and conventions as subjective, but that isn’t actually true.

No matter what you do in code, it has an effect and a bunch of side effects. If you arrange your work so that the side effects align, they tend to compound and make things better overall.

We’ve often seen this with popular trends in programming: one generation does things a particular way, then the next generation, seeking to differentiate itself, arbitrarily changes that. Oddly, the newer, more popular way of doing things isn’t always better; maybe half the time it is worse. The takeaway is that just because something is popular right now doesn’t mean doing it that way is a good idea. You have to be objective about the work and why you are doing it. You shouldn’t just blindly follow some trend.

With that in mind, I wanted to talk about some of the older, more defensive ways of coding, and how they really help you to write code that tends towards better quality.

The first thing comes from some older philosophies about pre-conditions and post-conditions. I think it was related to aspect-oriented programming and design by contract, but we had been using it long before those ideas surfaced. It’s a stricter version of procedural programming, really.

The start is to use lots of functions. Lots of them. Keep them small, keep them tight. Every function should just do one thing and do it well.

The primary reason that this is a good idea is that when you write some code, it is very unlikely that you’ll just write it once, perfectly, and it will hang around forever. So, instead, we need to assume that our first draft is nothing more than that. It’s just a roughed-in version of the code, and over time we’ll need to make it a lot better. Add better error handling, enhance its capabilities, etc. Code isn’t something you “write”, it is something you “evolve”.

If you have a lot of little functions and each one encapsulates something useful, then later when you are playing with the code or extending it, or dealing with unwanted behaviors, it is easier to work at a slightly higher level over and around the pieces, aka the functions.

Maybe you thought the 3 things you needed to do were order independent, then found out later that you had them in the wrong order. If all 3 things are interleaved into a giant mess, that is a lot of work to fix. If it’s just calling 3 functions, then it is trivial.
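To make that concrete, here is a minimal sketch (in Python, with hypothetical names) of three steps kept as separate functions, so that reordering them later is a one-line change:

```python
# Three small steps, each doing one thing. The names are invented
# for illustration.

def load_records(path):
    """Read non-empty lines from a file."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def normalize(records):
    """Lowercase every record."""
    return [r.lower() for r in records]

def deduplicate(records):
    """Drop duplicates, keeping the first occurrence."""
    seen = set()
    return [r for r in records if not (r in seen or seen.add(r))]

def process(path):
    # If it turns out deduplication must happen before normalization,
    # only this one line changes -- the steps themselves stay intact.
    return deduplicate(normalize(load_records(path)))
```

If the three steps were interleaved in one body, that same reordering would mean untangling the whole thing.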

So, we want that malleability, at the very small cost of executing a lot of different functions. Basically, we can see functions as some trivial syntax that wraps what you are doing. Well, almost.

Because you now have to come up with 3 unique function names, but that is actually a good thing. You know what the 3 steps are -- you can explain them to your friends -- so what words would you use in that explanation? Those words are the roots of your names; just make sure you compress them as hard as you can without losing their meaning. If you do that, and the names match the code, then the names are self-describing, and that helps later when you revisit the work.

Inside of the functions, you generally have a bunch of input variables and one or more output ones. The best functions are effectively stateless; everything they touch is passed in through their arguments. They do not reference, or fiddle with, anything that is global, or even scoped beyond their borders. That is a little different in the OO world, where methods do touch class variables, but even there you can be tight about their behavior. For instance, some methods only need the class variables for read-only usage; others write or update them. Keep those two types of functions separate from each other.
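A sketch of that split, using a hypothetical Python class: the read-only methods are grouped apart from the ones that mutate state.

```python
class Account:
    """Hypothetical example: read-only methods kept separate from
    mutating ones."""

    def __init__(self, balance):
        self._balance = balance

    # Read-only: these look at state but never change it.
    def balance(self):
        return self._balance

    def can_withdraw(self, amount):
        return 0 < amount <= self._balance

    # Mutating: the only methods allowed to change state.
    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        if self.can_withdraw(amount):
            self._balance -= amount
            return True
        return False
```

Grouping them this way makes it obvious, at a glance, which calls can possibly alter the object.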

In a function, it is possible that some of the incoming data is crap. So when we are running code in development, we really do want it to stop as early as possible on bad data. That makes it easier to figure out what went wrong; you don’t have to go backwards through the stack and find the culprits, it was just “one” bad call.

In production, however, for some types of systems, we want to be liberal. That is, the code should blunder through the work if possible. We do that so that bugs that escaped into production can be worked around by the users. If the code stops on the first data problem, it’s over; the whole thing stops. But if it kinda works with a bug, then the users have options. That is why we want two very different behaviors for bad data: in dev it is strict, and in prod it is lenient. A good build environment should differentiate between dev and prod releases at a minimum. Then we can switch the behavior as needed.
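One way that switch might look, as a sketch in Python. The ENV variable and the check helper are assumptions for illustration, not a standard mechanism; a real build system would set the flag per release.

```python
import logging
import os

# Hypothetical switch: assume the build/deploy environment sets ENV.
STRICT = os.environ.get("ENV", "dev") != "prod"

def check(condition, message):
    """Pre-condition guard: in dev, bad data stops the program
    immediately; in prod, it is logged and the code blunders on."""
    if condition:
        return True
    if STRICT:
        raise AssertionError(message)
    logging.warning(message)
    return False

def average(values):
    # Same guard, two behaviors: dev blows up right here, prod
    # returns a harmless None and logs a warning.
    if not check(values, "average: empty or missing input"):
        return None
    return sum(values) / len(values)
```

The point is that the guard code is written once, and the strict-versus-lenient decision lives in a single place.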

The next thing to do is to wrap the incoming arguments in what they used to call asserts. An assert is a statement declaring that a variable only holds very specific values, and that the remaining range of values is what the function below it will handle.

So, if you have a variable that could be null, but the code can’t handle nulls, you put in an assert that blows up on a null. Nulls aren’t the best example, though, in that it is usually far better to support them everywhere. So, the assert turns into a terminal condition, often of the form that if one of the inputs is null, the output is also null, or something similar.

Between the asserts and the terminal conditions, the very first part of the function is dedicated to making sure that all of the incoming variables are exactly what is needed for the code below. That is, they primarily act as a filter, rejecting any incoming crap. Only if the filter doesn’t reject stuff does the code below run. In dev, the filter stops the program; in prod, it writes warnings into the log.
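Putting the pieces together, here is a sketch of that “filter first” shape in Python. The function and its rules are hypothetical; the point is only the layout: terminal conditions and asserts at the top, clean work below.

```python
def monthly_payment(principal, rate, months):
    """Hypothetical example of the filter-first layout."""
    # Terminal condition: a null input yields a null output.
    if principal is None or rate is None or months is None:
        return None
    # Asserts: reject anything the code below cannot handle.
    assert principal >= 0, "principal must be non-negative"
    assert months > 0, "months must be positive"
    # The actual work, free of defensive clutter.
    if rate == 0:
        return principal / months
    m = rate / 12  # monthly interest rate
    return principal * m / (1 - (1 + m) ** -months)
```

Everything below the asserts can safely assume the inputs are sane, which keeps the real logic short and readable.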

Separating out the crappy variable handling from the rest of the code makes the rest of the code a lot simpler. Mixing and matching makes it really hard to tell what the code was really trying to do. So, this split is key to being able to read the code more effectively. Often you skip the asserts and other pre-conditions, then see what should have happened in the average case. Then you might look to see if the pre-conditions are strict enough.

Also, if there are incoming globals, which you can’t avoid for some reason, they exist in these pre-conditions too. You get them first, then later you play with them.

Once you have this tighter code base, it becomes easier to start reusing parts of it. Maybe you are missing some of these smaller functions? Maybe the thing in the function itself is wrong, or a bad idea? You can refactor quite easily; it doesn’t take much effort. If you maintain the stricter coding standards, that extra work pays off hugely when you have to go back later, and you’ll always have to go back later.

There are a lot more strong habits; it’s essentially an endless pursuit. Lots of this stuff was known and forgotten, but it really is effective in reducing bugs. The key point is that every programmer makes mistakes, so we should expect that. When we are building big stuff, in order to get decent quality we have to lean on good habits, technologies, different types of testing, and even branches of computer science like proof of correctness. It all depends on how much quality we need, since everything except good habits is fairly expensive. Over the last few decades, the quality of work has been steadily falling, but given our world’s dependence on software, people will eventually demand better quality.

Thursday, March 23, 2023

Connectors

I’ve always hated cell phones. I think it is because I’ve been in enough difficult work situations over the years where people would call me in the middle of the night with trivial stuff if I had one. But I also don’t like those little screens and crude interfaces.

What I would like is a connector.

A little, really tough, durable device that I can use to communicate with only a tiny number of people. Oddly, I want it to be a small fixed number, like five, so I can politely decline other connections.

Since it is small, maybe even a pin, you can always carry it around with you. It should be able to survive bouncing around in my pockets and be waterproof as well. Basically indestructible.

It would have a little button that toggles between the different people, and another to push to talk. When you want to contact someone you pick them, hold down the talk button, and speak. That audio is converted into text and sent to the other person in the cheapest, most reliable manner.

When someone receives a message the device vibrates and a light goes on. They hit a button and the text is converted back into audio (different voice though). Maybe it only stores the last three messages per person, it is not particularly historic. If they want to respond to a message they hold down the talk button and speak.

Behind the scenes, there are read and receive receipts. You do know if the other person has successfully received and listened to the message, although not exactly when. Since they can only hold three messages from you, you'll know when their device is full, but you can cause it to cycle if you want (if you really regret what you just said).

The only other feature is administration. It uses Bluetooth or infrared or some other local communication for setup. That is, you can only add someone when both devices are in the same room together and both people agree. Both press at the same time; it gets synced.

It’s meant for tight family connections, not general communication. Basically, instead of tracking people, you guarantee that when they get a chance they’ll hear what you said, even if they choose to ignore you. So, basically for short urgent communication or emergencies. But not suitable for gossip or rambling.

Thursday, March 16, 2023

Agile, Stand-ups, TDD and Code Reviews

When the Agile Manifesto came out 20 years ago, I really liked what it said.

At the time, I was banging away in a startup. Obviously, in that sort of environment, you can’t just go waterfall. Or even partial waterfall, or really have any water at all. It all has to be very reactive; you keep fiddling with the code until it gets traction. Then you figure out why it got traction and try to leverage that for even more traction. Startups were quite agile long before the manifesto used that term.

You can’t lay out a grand plan for yearly releases. You make some changes, polish it quickly, then get it out there for feedback. It is naturally iterative, and the cycle length is heavily dependent on the size of the work and the risks of not releasing it.

So, the essence of the Manifesto fit really well with the way we were forced to work. It was fast-paced, hectic, and very risky. But it was well understood that if we got traction, we’d slow right down and do it properly.

Over the decades as I watched the Agile movement mature, I was often fairly vocal about what I thought were mistakes in its direction. In order to sell its ideas to corporations, Agile ended up formalizing itself.

I think that is a huge mistake, in that anything formal is inherently static, which is exactly what being agile was about avoiding. That is, either you are agile, or you are static, there really is no in-between.

Agile as a formal process is really only definable as the other parts of the heavier processes that you throw away and avoid. Whatever is left is just enough to barely keep going. Then you are truly agile.

But the other deficiency was that a lightweight process makes a huge amount of sense in the middle of the chaos of a startup. You really don’t know what you are building until you’ve built enough of it to spark interest. So you build, pivot, build, pivot, etc. Under those conditions, it doesn’t make sense to cross all the t’s and dot the i’s as the life expectancy of the code is weeks or months. Just whack it out, see if it sticks, and then go back to the drawing board. Most releases are just demos.

These two things tend to make agile an excellent fit for startups and an impossibly bad fit for large projects in large organizations. It just doesn’t scale up or formalize properly.

There are lots of problems, but we can cherry-pick a few that are representative. As a great example, stand-ups always come to mind.

A big problem in development, particularly when a lot of people are introverts, is communication.

There are millions of details flying all over the place, and you need some reliable form of communicating them, even for groups that are rather quiet. Because of this, leads in tight spaces often develop the habit of bouncing around every morning to all of their people and getting a quick one-on-one update. “Is everything ok? Are you blocked on something? What do you need me to do?”

Depending on the work, the people, and the scheduling, some coders need at least a daily check-in, while others can go a week or so without a status update. So, many leads get into the habit of just making rounds in the morning before they get down to their own work. Which is very agile, but also subjective and variable.

But Agile decided that you can’t package that and sell it. Instead, they came up with an “optimized” method of having the lead pull everyone together, every morning, for quick updates. But since it is now a group effort, they tried to keep it time constrained by having everyone stand up during the meeting, in the hopes that people’s feet will tire out. That’s why it is called a ‘stand-up’.

If you’ve guessed it already, that is a formal and somewhat diminutive means of ensuring that the leads maintain their communication. It’s also awful and there is nothing “agile” about it. You mandate a stupid meeting, every day, even when it is useless.

As well, maybe you save the lead a bunch of time, but that comes by stealing time from the rest of the group. Mostly that is a bad idea. Communication is good, but some weird immature formalization of it is bad.

We see the same kinda thing with unit testing.

Clearly, the best way to test any software is with fully automated ‘system’ testing that is thorough enough that it can be used for full regression testing. That is really the only way to ensure really high quality, but it is also crazy expensive. In its fullest form, it is at least as much code as the thing it is going to test, and you have to build a custom version of it, every time, for every system. Way too expensive for most projects.

So, unit tests are great for ensuring that a few components amongst the whole set have good quality. Good tests are expensive to write, but if you are careful and put your limited efforts into the right areas, it pays huge dividends. You find way more bugs than you would have by doing manual testing. But you certainly will not find all of the bugs, and if the tests are crappy, you won’t find many bugs.

But in the crazy rush to sell Agile, the tagline was that unit testing was all that was needed. That if you somehow got full coverage in unit tests, then it would find every bug and your quality would be perfect, which is insane.

But it got worse. Some people invented a fun little game called “test driven development” (TDD). You write the unit test first, then write the code to pass that test. It’s actually a neat exercise and I’ve always thought that it should be used heavily in education. It forces a programmer to think both about the “inside” of the code, and about what is “outside” as well.
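A miniature round of that game might look like this in Python; the slug function and its test are invented for illustration.

```python
# Hypothetical TDD exercise, in miniature.

# Step 1: write the test first, describing the "outside" behavior.
def test_slug():
    assert slug("Strong Coding Habits") == "strong-coding-habits"
    assert slug("  Already--clean ") == "already--clean"

# Step 2: write just enough code to make the test pass.
def slug(title):
    return "-".join(title.lower().split())

test_slug()  # passes, so this round of the game is done
```

The rhythm is what matters: the test pins down what the function must do before a single line of implementation exists.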

The problem is that, as is often the case, some other people figured that this game was something that everyone should do, all of the time. They even said that it would, all by itself, make you a better programmer. While I think TDD has some good qualities, and I do believe it would really help new programmers learn the art, it would be useless for someone like me who has been coding, steadily, for thirty years, and it would also have no real bearing on making large complex components any better.

In that second case, the order that you create the test cases will dictate the order that you build the code, which will influence the way you build the code. That is, the very best code is directly tied to a very specific set and order of the tests. If you don’t come up with those tests in that order, you won’t arrive at that code.

You wouldn’t notice this deficiency with small examples, but the moment you started doing real stuff, and the order got bounced around by issues like changing scope, there could be no real expectation that at the end the code was any good, let alone anything better than spaghetti. Code is only ever not a mess when someone explicitly understands it and keeps it organized. There are few external forces that can do that automatically; TDD is not one of them.

Overall, testing is very important. And very expensive. So, you use very different types of tests for the different parts of the program, depending on the scheduling. You might need to make sure one piece is nearly flawless, but for some of the others it is cheaper to just toss them out there and fix them later. It depends, and it has no easy answers, thus it defies formalizing. Because if it is formal, it is exceptionally long and painful.

My last example is code reviews.

Way back, when I was very young, we had a massive project that needed super high quality. We absolutely did code reviews. Because none of the modern tools existed then, we literally rolled our own repo, and the reviews were baked into that tooling. It was good, in that a few times I caught an epic disaster long before it escaped into the wild. You absolutely need that for high-quality work.

So, code reviews, in general, are great, and they act as a form of vetting the code before it gets out the door. Pre-testing it could be called.

The problem is that since that initial experience, pretty much every time since when I’ve seen people do code reviews, they have been worse than useless. Either they don’t seriously look at the code, or they nitpick stupid stuff. I can’t think of one example like that earlier one, where code reviews saved the day. They’ve all been silly and ineffective.

The core problem is coding standards and conventions.

In that early work, we had extremely strict standards and conventions. You 'had' to follow them, and it was obvious when you didn’t. But they weren’t arbitrary, they were put together as a means of defensive programming, which we needed to get that high quality.

When I saw code reviews later, the teams didn’t want to be strict at all, so the reviews themselves were compromised. That is, if you don’t have a well-defined set of “rules”, then reviewing any code against no rules is useless. It’s just subjective at that point.

Sadly, the industry turned away from a lot of the earlier, stricter, better practices, believing that they were too slow, and thus granting a lot more freedom to all of the programmers. But that freedom has a bunch of costs, like poor readability and obviously, making code reviews useless. You can’t code review just for the sake of saying you reviewed the code, you have to review it for some very specific reasons, like standards and conventions.

And so we see that it’s a tradeoff again. If you let any programmer do anything they want, then any sort of code review is entirely useless. If you clamp down hard on what code is acceptable, then you can review stuff to ensure that it is following those rules.

So you can’t just formalize code reviews without formalizing coding itself. It’s not going to work. And it’s like the other examples. They take something that sounds good, that in some cases is happening naturally, then try to make a “formal process or rule” out of it and add it to the methodology.

All of that would be fine if it worked, but you really don’t need to be agile or super reactive for most large development projects anyways. That is, mostly the stuff is a late-stage rewrite of some other stuff that failed because its quality degraded to rock bottom. It doesn’t need to be agile; what you need is to do a better job building stuff, so it is less flaky and lasts longer. Good engineering is better than extreme reactivity. And that usually requires thinking, planning, learning, and patience. It needs to be proactive.

I was very surprised that the Agile sales job did as well as it did. I guess big companies were struggling with coding anyways and were desperate for anything that might work. Still, it seems odd that a big company would want an “agile process” just like a startup, but then they cripple it by badly formalizing the “agile” parts of it away. So, you don’t get “agile”, you get an overly reactive game that tries really hard to not think ahead of itself, while mindlessly spewing out bad code forever, or what we like to call “fragile”.

Thursday, March 9, 2023

Scaling Methodologies

As it usually does, the ultimate quest for a perfect one-size-fits-all methodology has failed. This is a timeless statement, in that it doesn’t matter when or which methodology we are talking about. The quest always fails.

Why?

Because the size of a project dictates the rules and processes you need to keep it running smoothly.

A small project that is chaotic is barely distinguishable from one that is super organized, except it might finish slightly later. It’s small, failures are fine and easily recoverable. So, you want something ultra-lightweight, really easy. The work only goes on for days or weeks. It can be totally reactive. Try stuff, demo it, repeat.

But that changes with a medium project, whose duration is months. You need something a bit heavier.

If you need better quality, you need even more weight. You need to formalize more parts of the work. Mostly, it can still be dynamic and somewhat reactive, but proactive plans increase efficiency by avoiding redundancies and batching up stuff.

You might have to worry about staff turnover, so some parts of the work should be documented just in case. Sometimes the months turn into years. So long as the codebase remains small, you can stay with the medium process.

Breaking medium projects up into lots of small projects won’t work because they are still dependent on each other. You can’t pretend that half a medium project is a small project, it is not. It is a part of a medium project. If you treat that half incorrectly, it will get out of control. You are often bound by the higher attributes of any dependency unless it's been almost fully encapsulated, like an app in a platform. Still, even in those cases, the larger dependency taints the smaller one.

Once you cross the line and the scale becomes large everything changes again. As well as the codebase organization, the process has to change as well. A lot more stuff needs to be formalized, more people added, and more communication. Way more details flying around.

When this jump occurs it is important to understand that the code needs to be refactored, or even sometimes, fully rewritten. Large projects are a pain. They require very different organization than medium ones. There are at least a dozen people involved in some way, usually a lot more, even if there is still only one coder. The code grows, the impact grows, and the consequences of bugs and failure grow. Projects that evolve eventually cross this threshold and they really need to be revamped. It is easier with greenfield work, which is usually handled far better.

A wrong turn or dead end is really expensive now, so planning is super necessary, as are techniques like prototypes, scale models, etc. You can’t just do stuff and see if it works later; you need to be far more sure that the changes add value and solve real problems. Everything needs to be explicit and formalized. Being reactive is pretty bad now; most stuff except production outages has to be planned out in advance. The whole ship turns very slowly.

Large systems tend to integrate with lots of other stuff around them. Letting those connections become spaghetti is a pretty serious mistake. Keeping them organized can push up the scale even higher.

At the next level, huge projects involve hundreds or even thousands of people, they go on for years or decades. Huge projects need an amazing amount of coordination, they are exceptionally slow-moving. No clever ideas will negate that. If there are billions of details, at some point, you have to accept that. Just keeping something huge headed off in a sane direction is a big problem all on its own, it doesn’t matter what you are trying to build.

The size of the project drives the budgeting, the expected lines of code, the methodology, the number of people, the tech stacks; everything. Four categories: small, medium, large, and huge, are usually enough to get across the effects of size. It’s basically an exponential scale, doubling at every new interval, which roughly matches complexity growth. That is, some projects may feel linear with respect to time, but the dependencies tend towards a worst case of exponential growth, so we should treat it as exponential.

Software development projects are highly multi-dimensional, so it is important to not lose sight of the goal. You are assembling a collection of features that make it possible for users to solve their problems. So getting the correct implementations of those features into the hands of the users is a success. Broken implementations or missing features are not a success. A good project does this smoothly, over and over again.

Thursday, March 2, 2023

Refactoring

Sometimes it just doesn’t fit. They tried calling it “code smells” but that phrase was unsatisfactory, it just doesn’t quite capture the problem.

You know you want the function to be clean. So you move the messy lines up, down, or shift them over. But that pushes the problem somewhere else, which sets off yak shaving. Each new refactor forces another; it feels endless.

You fix the names to be what you thought you should have called them initially. You fiddle with it to remove any noise and syntactic sugar. Maybe throw a comment here or there.

In the middle, you get scared that you are just going around in circles, but you persist. Too late now.

More shifting, more rearrangement, more fiddling. Some pieces get pushed onto the wrong side of an architectural line; you push them back. Some parts get packed into tables, you collapse whole trees, and lots of redundant code falls away.

You’re at it for days, weeks even. In between, running the code just shows that nothing works. But still, you persist.

And then, almost unexpectedly, it all just snaps into place. A loud, non-existent thundering crack breaks the silence.

You run the code, it is nearly perfect. You check out the code, it is beautiful. Not only are the smells all gone, but most of the bugs have disappeared too. What’s left is simple and readable. It all makes perfect sense now. It looks so obvious you wonder why you didn’t see it right away. You can finally sleep.

The next day someone glances at the code and says “That’s it? Why did it take so long?”. You sigh and know deep down inside, that good code, the really valuable stuff, doesn’t just erupt from the keyboard. Angst, agony, and pain are all essential parts of what is needed to find it in a sea of really bad stuff. The quality of the code is proportional to the work you put into it. The first draft is just the beginning, not the end.