Thursday, November 2, 2023

Special Cases

One of the trickiest parts of coding is keeping the code from becoming a huge mess under the pressure of rapid changes.

It’s pretty much impossible to get a concrete static specification for any piece of complex software, and it is far worse if people try to describe dynamic attributes as static ones. As such, changes in software are inevitable and constant. When you write something, be prepared for it to change, because it will.

One approach to dealing with this is to separately encode each and every special case as its own stand-alone siloed piece of code. People do this, but I highly recommend against it. It is just an exponential multiplier on the amount of work and testing necessary. That is time that could have been saved by working smarter.

Instead, we always write for the general case, even if the only thing we know today is one specific special case.

That may sound a bit weird, but it is really a mindset. If someone tells you the code should do 12 things for a small set of different data, then you think of that as if it were general. But then you code out the case exactly as specified. Say you take the 3 different types of data and put them directly through the 12 steps.

But you’ve given it some type of more general name. It isn’t ProcessX; it is something more akin to HandleThese3TypesOfData. Of course, you really don’t want the name to be that long and explicit. Pick something more general that covers the special case, but does not explicitly bind to it. We’re always searching for ‘atomic primitives’, so maybe it is GenerateReport, but it only actually works for this particular set of data and only goes through these 12 steps.
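As a rough sketch of that starting point, here is what the first version might look like in Python. Everything in it is made up for illustration: generate_report stands in for GenerateReport, the three type names are hypothetical, and the single formatting line is a placeholder for the 12 real steps.

```python
# Hypothetical names: the three data types we were told about.
KNOWN_TYPES = {"invoice", "receipt", "refund"}


def generate_report(records):
    """General name, narrow behavior: today it only handles the known case."""
    lines = []
    for record in records:
        if record["type"] not in KNOWN_TYPES:
            # Be explicit about the limits instead of guessing.
            raise ValueError(f"unsupported data type: {record['type']}")
        # Placeholder for the 12 specified steps, coded exactly as requested.
        lines.append(f"{record['type']}: {record['amount']:.2f}")
    return "\n".join(lines)
```

The name promises more than the body delivers, and that is deliberate. Callers bind to the general name, so the body can grow later without anyone having to change how they call it.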

And now the fun begins.

Later, they have a similar case, but different. Say it is 4 types of data, but only 2 overlap with the first instance. And it is 15 steps, but only 10 overlap.

You wrap your GenerateReport into some object or structure that can hold any of the 5 possible data types. You set an enumeration that switches between the original 12 steps and the newer 15 steps.

You put an indicator in the input to say which of the 2 cases matches the incoming data. You write something to check the inputs first. Then you use the enum to switch between the different steps. Now someone can call it with either special case, and it works.
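Here is one way that second stage could look, again only as a sketch. The ReportKind enum, the type names, and the step names are all invented, and the steps just record that they ran. I have also assumed the 10 shared steps run first, which a real set of steps may not allow.

```python
from enum import Enum


class ReportKind(Enum):
    ORIGINAL = "original"  # first case: 3 types, 12 steps
    EXTENDED = "extended"  # second case: 4 types, 15 steps


# Hypothetical type names: 2 overlap, so there are 5 distinct types overall.
ALLOWED_TYPES = {
    ReportKind.ORIGINAL: {"invoice", "receipt", "refund"},
    ReportKind.EXTENDED: {"receipt", "refund", "credit", "transfer"},
}


def make_step(name):
    """Build a placeholder step that only records that it ran."""
    def step(trace):
        trace.append(name)
    return step


# 10 steps are common to both cases; the rest belong to one case only.
SHARED = [make_step(f"shared_{i}") for i in range(1, 11)]
ORIGINAL_ONLY = [make_step(f"original_{i}") for i in range(1, 3)]
EXTENDED_ONLY = [make_step(f"extended_{i}") for i in range(1, 6)]

STEPS = {
    ReportKind.ORIGINAL: SHARED + ORIGINAL_ONLY,
    ReportKind.EXTENDED: SHARED + EXTENDED_ONLY,
}


def generate_report(kind, records):
    # Check the inputs first, against the case the caller indicated.
    allowed = ALLOWED_TYPES[kind]
    bad = sorted({r["type"] for r in records if r["type"] not in allowed})
    if bad:
        raise ValueError(f"{kind.value} report does not accept: {bad}")
    # Then switch on the enum to pick which steps to run.
    trace = []
    for step in STEPS[kind]:
        step(trace)
    return trace
```

The shared steps exist exactly once, so the second case costs 5 new steps rather than 15.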

Then more fun.

Someone adds a couple more special cases. You do the same thing, trying very carefully to minimize the logic wherever possible.

Maybe you put polymorphism over the input to clean that up. You flatten whatever sort of nested logic hell is building up. You move things around, making sure that any refactoring that hits the original functionality is non-destructive. In that way, you leverage the earlier work, instead of redoing it.
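To make the polymorphism idea a little more concrete, here is a minimal sketch. The class names, type names, and step names are all hypothetical, and the steps are reduced to strings so the shape of the design stays visible.

```python
from abc import ABC, abstractmethod

# Placeholder names for the steps that every case has in common.
SHARED_STEPS = [f"shared_{i}" for i in range(1, 11)]


class ReportInput(ABC):
    """Each special case becomes one small subclass of the input."""

    allowed_types = frozenset()

    def __init__(self, records):
        self.records = records

    def validate(self):
        bad = sorted({r["type"] for r in self.records} - self.allowed_types)
        if bad:
            raise ValueError(f"unsupported data types: {bad}")

    @abstractmethod
    def steps(self):
        """Return the ordered step names for this case."""


class OriginalInput(ReportInput):
    allowed_types = frozenset({"invoice", "receipt", "refund"})

    def steps(self):
        return SHARED_STEPS + ["original_1", "original_2"]


class ExtendedInput(ReportInput):
    allowed_types = frozenset({"receipt", "refund", "credit", "transfer"})

    def steps(self):
        return SHARED_STEPS + [f"extended_{i}" for i in range(1, 6)]


def generate_report(report_input):
    # No branching on the case here: the input object carries its own rules.
    report_input.validate()
    return [f"ran {name}" for name in report_input.steps()]
```

The if/else ladder is gone from generate_report, so a new special case is one more subclass and the paths that already work are not touched.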

And it continues.

As time goes by, you realize that some of the special cases can be collapsed down, so you do that. You put in new cases and collapse parts of the cases as you can. You evolve the complexity into the code, but make sure you don’t disrupt what already works. The trick is always to leverage your earlier work.

If you do that diligently, then instead of a whole pile of spaghetti code, you end up with a rather clean, yet sophisticated processing engine. It takes a wide range of inputs but handles them and all of the in-between permutations correctly. You know it is correct because, at the lower levels, you are always doing the right thing.

It’s not ‘code it once and forget it’, but rather carefully grow it into the complicated beast that it needs to become in order to be useful.
