Tuesday, March 29, 2022

Functions

When I first started coding, there was a distinction between functions and procedures.

A function was essentially an operator, as in mathematics. You hand it arguments, and it returns some output.

Procedures, however, were just a set of steps that the computer needed to follow. There was no return value, but there could still be a side effect, like setting an error code somewhere globally.

That distinction got washed away, particularly in C, where the return type of a function was allowed to be void.

In a sense, if you were going to write some steps for the computer to follow, you would create a procedure. It was very focused on doing just the right steps. Inside of it, however, you might call a whole lot of functions that nicely wrapped some of the mechanics. Those functions were essentially reusable; the procedures, however, were not.
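As a rough sketch of that old split, here is a hypothetical example in C-style code; the names are made up, but the function computes a value while the procedure just performs steps with side effects:

    #include <stdio.h>
    #include <math.h>

    /* A function in the old sense: you hand it arguments, it returns a value. */
    double hypotenuse(double a, double b) {
        return sqrt(a * a + b * b);
    }

    /* A procedure in the old sense: just steps to follow, no return value,
       only side effects (here, printing to the console). */
    void print_report(double a, double b) {
        printf("sides: %.2f and %.2f\n", a, b);
        printf("hypotenuse: %.2f\n", hypotenuse(a, b));
    }

    int main(void) {
        print_report(3.0, 4.0);
        return 0;
    }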

That seemed to be the foundation of the Procedural paradigm. All of the other organizational constructs that you need to craft large programs were left up to the coders.

That distinction between functions and procedures disappeared.

Then gradually, the idea of breaking up large computations into reusable functions vanished too.

Programmers preferred to see everything for a given procedure in its full detail. So, you’d see massive functions, often spanning pages and pages. Everything was there, very explicitly.

While in some ways it is easier to write code this way, it’s rather horrible to read these behemoths. By the time you’ve reached the end, you’ve forgotten how it all started. It’s like jamming an entire novel into one massive endless paragraph.

So, paradigms like Object-Oriented (OO) were invented to avoid this type of code. The computations that affect an object’s data are written as small functions and placed close together, close to the data.

To figure out how it works, you just need to understand how the objects interact, not what all of the underlying code does. It is a form of layering, implicitly placed into the programming language itself. If the objects themselves are relatable to the outputs (for instance, everything visible on the screen is its own object), then it is way easier to debug. You just see what the problem is when it runs, then adjust a few objects’ behaviors to correct it.

The earliest OO code often had a huge number of tiny functions; one-liners were common.
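As an illustration, a small hypothetical class in that early style, with tiny methods kept right next to the data they touch:

    #include <iostream>
    #include <stdexcept>

    // The data and the tiny functions that affect it live together.
    class Account {
    public:
        explicit Account(double balance) : balance_(balance) {}

        double balance() const { return balance_; }                    // one-liner
        bool canWithdraw(double amount) const { return amount <= balance_; }

        void deposit(double amount) { balance_ += amount; }
        void withdraw(double amount) {
            if (!canWithdraw(amount)) throw std::runtime_error("insufficient funds");
            balance_ -= amount;
        }

    private:
        double balance_;
    };

    int main() {
        Account account(100.0);
        account.deposit(25.0);
        account.withdraw(40.0);
        std::cout << account.balance() << "\n";   // prints 85
        return 0;
    }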

But as time went on, more and more Procedural bad habits invaded the work, to the point where now you often see a top level of faux objects, pretty much at that old procedure level, with just straight-up Procedural code inside.

The problem with Procedural, though, was that because it didn’t explicitly define how the code should be organized, there was always a very wide range of possibilities. Some programmers would make really good choices that resulted in simple, obvious, and readable code. But more often, it was just a twisty mess, hopelessly hacked at until it sort of worked. Freedom, particularly for inexperienced coders, is rarely a good thing.

The reinvasion of Procedural styles also had the unwanted effect of making programmers afraid of functions. Less was considered better. And later, they became afraid of layers too.

Sure, in the fairly early days one could argue that function calls could be expensive at times, but that hasn’t been the case for decades. It is far better to just assume that they are free, particularly since removing them later as an optimization is a whole lot easier than trying to add them into an existing mess to make it more readable.

Functions have another huge tactical advantage. Since they need to name a ‘virtual concept’, and the computer doesn’t care whether they exist or not, they act as an explicit form of documentation for the people reading the code.

For most OO code, the objects are the nouns from the technical side or business domain, while the functions, usually referred to as ‘methods’, are the verbs. That means that it is possible to decompose a given specification for a feature almost directly into its nouns and verbs, keeping a near 1:1 mapping between the specification and the code.

In a way, it solves the naming crisis. If you can explain, in full, how you are going to solve a problem for somebody with software, then you also have the names of practically every variable and method. Well, at least the major ones; some of the underlying transformations might not have explicit terminology in the descriptions, but they usually do in either the software industry or the business domain itself.

But even if you don’t use the proper names, so long as you are consistent, the meaning and intent of the code are clear. And if another programmer finds better wording, most IDEs allow for easy renaming. So the code can be improved quite easily.
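As a sketch, take a hypothetical specification sentence like “a customer places an order for a product”; the nouns and verbs map almost directly onto the objects and methods:

    #include <string>
    #include <vector>

    // Nouns from the specification become the objects.
    struct Product {
        std::string name;
        double price;
    };

    class Order {
    public:
        // Verbs from the specification become the methods.
        void addItem(const Product& product) { items_.push_back(product); }
        double total() const {
            double sum = 0.0;
            for (const auto& item : items_) sum += item.price;
            return sum;
        }

    private:
        std::vector<Product> items_;
    };

    class Customer {
    public:
        Order placeOrder(const Product& product) {
            Order order;
            order.addItem(product);
            return order;
        }
    };

    int main() {
        Customer customer;
        Order order = customer.placeOrder({"coffee", 4.50});
        return order.total() > 0.0 ? 0 : 1;
    }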

Given the original split between the high-level steps triggered from an endpoint and the low-level mechanics underneath, it makes a lot of sense to embed that split into the layering.

You want all of your endpoints to be ultralight. Then, where you have an explicit set of steps, it is best to wrap each one in a well-named, self-describing function.

Below this, however, you want as much reuse as you have time to add. Mostly because it will cut down on massive amounts of work, testing, and bugs, and it will reduce stress. If the interactions at this lower level are properly coded to always do the right thing, the low-level testing drops to nothing as the code matures. That might not make sense for a quick little project, but most systems stay in active development for years and years, so it is a huge boost forward.
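A sketch of that layering, with invented names: the endpoint stays ultralight, each step is wrapped in a self-describing function, and the lower-level helpers are the reusable part:

    #include <iostream>
    #include <string>
    #include <vector>

    // Lower level: small reusable helpers, shared across many features.
    std::vector<std::string> loadUsers() { return {"ada", "grace"}; }

    std::string renderAsHtmlList(const std::vector<std::string>& rows) {
        std::string html = "<ul>";
        for (const auto& row : rows) html += "<li>" + row + "</li>";
        return html + "</ul>";
    }

    // Middle: each explicit step wrapped in a well-named, self-describing function.
    std::string buildUserListPage() {
        return renderAsHtmlList(loadUsers());
    }

    // Top: the endpoint itself stays ultralight.
    void handleUserListRequest() {
        std::cout << buildUserListPage() << "\n";
    }

    int main() {
        handleUserListRequest();
        return 0;
    }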

Although Procedural, essentially brute force, is the industry’s preferred means of coding, it is way more time-intensive, and the lack of readability significantly shortens the lifespan of the code.

The work may get out of the gate faster, but all of the redundancies and make-work quickly catch up with the progress, killing it prematurely. If instead, you try hard to keep the functions small and tightly focused, it leaves open the possibility to apply strong refactorings to the code later to remove any systemic problems with the initial rushed construction. This makes it possible to grow the code, instead of just reinventing it all of the time. It’s a much better approach to building stuff.

Sunday, March 13, 2022

Paradigms, Patterns and Idioms

To build large sophisticated software, you have to learn how to avoid wasting time on make-work, particularly with creating redundant code.

If the codebase is small, duplicating a few things here or there really isn't noticeable. But when the code is large or complex, it becomes a crushing problem. Sloppy coding doesn’t scale.

Over the decades, lots of people have defined or discussed the issues. Some of their findings make it into practice, but often as new generations of programmers enter the field, that type of knowledge gets ignored and forgotten. So it’s always worth going back over the fundamentals.

A paradigm is an overall philosophy for arranging the code. This would include Procedural, Object-Oriented, Functional, ADT, Lambda Calculus, or Linear Algebra, but there are probably a lot of others.

Procedural is basically using nothing but functions; there is no other consistent structure. OO decomposes by objects, Functional tends to draw more from Category Theory, while languages like Lisp aim for lambda calculus. APL and MATLAB are oriented around vectors and matrices from linear algebra.

Object-Oriented is the most dominant paradigm right now; you decompose your code into a lot of objects that interact with each other.

However, lots of programmers have trouble understanding how to do that successfully, so it’s not uncommon to see minimal objects at the surface that are filled with straight-up Procedural code. There are some functions, but the code tends to hang together as very long sequences of explicit high and low-level instructions. The wrapping objects, then, are just pro forma.

A paradigm can be used on its own, or it can be embedded directly into the programming language. You could do Object-Oriented coding in straight-up C (it is messy but possible), but C++ makes it more natural. The first C++ compilers were just preprocessors that generated C code. That’s a consequence of all of these languages being Turing-complete: there is almost always a way to accomplish the same thing in any language, it’s just that the syntax might get really ugly.

We even see that with strict typing; if there is a generic workaround like Object, you can implement loose typing on top of a strict language. Going the other way, layering strictness on top of loose typing, is obvious. In that sense, we can see strict typing as a partial paradigm itself that may be implemented directly in the language or layered on top.
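As a sketch of that workaround, in C++ you can get the effect of loose typing by pushing everything through a generic type like std::any; the static type never changes, but the runtime type does:

    #include <any>
    #include <iostream>
    #include <string>
    #include <typeinfo>

    int main() {
        // The static type is always std::any, so the variable is effectively 'loose'.
        std::any value = 42;
        value = std::string("now a string");   // rebinding to a different runtime type

        // The real type has to be recovered at runtime, as in a loosely typed language.
        if (value.type() == typeid(std::string)) {
            std::cout << std::any_cast<std::string>(value) << "\n";
        }
        return 0;
    }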

Some languages allow you to mix and match paradigms. In C# you can easily do Procedural, Object-Oriented, Functional, and Declarative, all at the same time, intermixed. However, that is an incredibly bad idea.

Complexity exists at all of the borders between paradigms, so if the code is arbitrarily flipping back and forth, there is a huge amount of unnecessary artificial complexity that just makes it harder to infer that the code is doing what it is supposed to be doing.

So the key trick for any codebase is to commit to a specific paradigm, then stick to it.

We tend to allow new programmers to switch conventions on the fly, but that is always a mistake. It’s a classic tradeoff: save time today by not learning the existing codebase, but pay for it a few releases later. Software is only as good as the development team’s ability to enforce consistency. Big existing codebases have a long, slow ramp-up time for new coders; not accepting that is a common mistake.

Within a paradigm, large parts of the work may follow a routine pattern. Before the GoF (Gang of Four) coined the term ‘Design Patterns’, people used all sorts of other terms for these repeating patterns, such as ‘mechanisms’.

It might seem that data structures are similar, but a design pattern is a vague means of structuring some code to guarantee specific properties, whereas data structures can be used as literal units of decomposition.

One common mistake called Design Pattern Hell is to try and treat the patterns as if they were explicit building blocks; you see this most often when the pattern becomes part of the naming conventions. Then the coders go through crazy gyrations to take fairly simple logic and shove it awkwardly into the largest number of independent patterns. Not only does it horrifically bloat the runtime, but the code is often extra convoluted on top. Poor performance and poor readability.

But patterns are good, and consistently applying them is better. Partly because you can document large chunks of functionality just by mentioning the pattern that influenced the code, but also because programmers often get tunnel vision on limited parts of the computations, leaving the other behaviors erratic. Weak error handling is the classic example, but poor distributed programming and questionable concurrency are also very popular. If you correctly apply a complex pattern like flyweights or model-view-controller, the pattern assures correctness even if the coder doesn’t understand why. There are far older patterns like using finite state machines as simple config parsers or producer-consumer models for handling possible impedance mismatches. Applying the patterns saves time and cognitive effort, while still managing to not reinvent the wheel. It’s just that they aren’t literal; they are just patterns.
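As a sketch of one of those older patterns, here is a tiny finite state machine used as a simple config parser; the states and the key=value format are made up purely for illustration:

    #include <iostream>
    #include <map>
    #include <string>

    // A tiny finite state machine for parsing "key=value" lines.
    std::map<std::string, std::string> parseConfig(const std::string& text) {
        enum class State { Key, Value };
        std::map<std::string, std::string> config;
        State state = State::Key;
        std::string key, value;

        for (char c : text) {
            switch (state) {
            case State::Key:
                if (c == '=') state = State::Value;   // key finished, start the value
                else if (c != '\n') key += c;
                break;
            case State::Value:
                if (c == '\n') {                      // line finished, store the pair
                    config[key] = value;
                    key.clear(); value.clear();
                    state = State::Key;
                } else value += c;
                break;
            }
        }
        if (!key.empty()) config[key] = value;        // last line without a newline
        return config;
    }

    int main() {
        auto config = parseConfig("host=localhost\nport=8080\n");
        std::cout << config["host"] << ":" << config["port"] << "\n";
        return 0;
    }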

At a lower level are idioms. They usually just scope very tightly around a specific block or two of code. Idioms are the least understood conventions. You see them everywhere, but few people recognize them as such.

Some sub-branches of coding, like systems programming, rely more heavily on applying them consistently, but most of the less rigorous programming, like application code, doesn’t bother. The consequence is that when application code dips into issues like catching exceptions or locking, it is usually very unstable. It kinda works, but not really. Getting it right, for example with locking, means choosing the right idiom and rigorously making sure it is applied everywhere. Inconsistencies usually manifest as sporadic failures that are nearly impossible to debug.
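For example, one common locking idiom in C++ is to guard every access to the shared data with an RAII lock; the Counter class here is invented, but the point is that the exact same idiom appears in every method that touches the data:

    #include <iostream>
    #include <mutex>
    #include <thread>

    class Counter {
    public:
        void increment() {
            std::lock_guard<std::mutex> lock(mutex_);  // same idiom, every time
            ++count_;
        }
        int value() const {
            std::lock_guard<std::mutex> lock(mutex_);  // reads are guarded too
            return count_;
        }

    private:
        mutable std::mutex mutex_;
        int count_ = 0;
    };

    int main() {
        Counter counter;
        std::thread t1([&] { for (int i = 0; i < 1000; ++i) counter.increment(); });
        std::thread t2([&] { for (int i = 0; i < 1000; ++i) counter.increment(); });
        t1.join();
        t2.join();
        std::cout << counter.value() << "\n";   // always 2000, because the idiom is consistent
        return 0;
    }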

There are way too many idioms to try and list them. They cover everything from the way you declare your code, to guard checks like asserts, to the way loops are unrolled. Most of them either enforce strictness, help with performance, or assure other critical properties. They are forgotten as fast as they are invented, and often differ by language, tech stack, culture, or expected quality. But finding good strong idioms is one of the key reasons to read a lot of other people's code; literate programmers are always stronger than illiterate ones.

Enforcing idioms is tricky. It should be part of code reviews, but a lot of people mistake idioms for being subjective. That might be livable for hasty in-house applications programming, but idioms really are key to getting quality and stability. Consistency is a key pillar of quality. You might initially pick a weak idiom (it happens), but if you were consistent, it is nearly trivial to upgrade it to something better. If you weren’t consistent, it’s a rather nasty slog to fix it, so it will probably get stuck in its low-quality state.

The biggest problem with all of these concepts is that most programming cultures strongly value freedom. The programmers want the freedom to code things whichever way they feel like. One day they might feel ‘objecty’, the next it is ‘functionish’.

But being that relaxed with coding always eats away at the overall quality. If the quality degrades far enough and the code is really being used by real people for real issues, the resulting instability will disrupt any and all attempts to further improve the code. It just turns into a death march, going around in circles, without adding any real value. Fixing one part of the system breaks the other parts, so the programmers get scared to make significant changes, which locks in the status quo or worse. Attempts to avoid that code by wrapping layers around it like an onion might contain the issue, but only by throwing in lots of redundant code. The same goes for just adding things on the side. All of these redundancies are make-work; they didn’t need to happen, and they are forced on people because they are avoiding the real issues.

Ultimately coding is the most boring part of software development. You’ve identified a tangible problem and designed a strong solution; now you just need to grind through the work of getting it done. The ever-frequent attempts at making code fun or super creative have instead just made it stressful and guaranteed to produce poor quality. It seems foolish to just keep grinding out the same weak code, only to throw it away and do it all over again later. If we’d just admit that it is mostly routine work, maybe we’d figure out how to minimize it so that we can build better stuff.

Saturday, March 5, 2022

Piecewise Construction

One of the popular ways to build software is to start by focusing on a smaller piece of a larger problem. You isolate that piece, code it up, and then release it into the wild for people to use.

While that might seem like the most obvious way to gradually build up large systems, it is actually probably the worst way to do it.

There are two basic problems. The first is that you end up building the same stuff over and over again, resulting in a lot of extra and unnecessary work. The second is that people usually choose the easiest pieces as the starting point, leaving the bigger and often more important problems to possibly never get solved or to show up way too late.

If you look at a big system and follow the code for a feature all the way down to its persistence, you will find that for similar entry points, the differences are small. A gui screen that shows a list of users, for example, is likely 90% similar to one that shows some other domain data. It’s basically all the same. The code differs at the top, in the way the screen is wired, and at the bottom, in the schema, but the rest of it is doing the exact same thing as at least one other piece of code in the system. The same is true for any data import or data export code. There are variations needed for each, but if you isolate them, they are always small.
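As a rough sketch of isolating that variation, the generic listing code below is shared, and only the small formatting piece (the ‘schema’ part) differs per screen; all of the names and types are invented:

    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    // The shared 90%: take some rows, format them, and print a listing.
    // Only the row formatter (the "schema" part) varies per screen.
    void showListing(const std::string& title,
                     const std::vector<std::vector<std::string>>& rows,
                     const std::function<std::string(const std::vector<std::string>&)>& formatRow) {
        std::cout << "== " << title << " ==\n";
        for (const auto& row : rows) std::cout << formatRow(row) << "\n";
    }

    int main() {
        // Two "screens" that differ only in their data and in how a row is formatted.
        std::vector<std::vector<std::string>> users = {{"ada", "admin"}, {"grace", "user"}};
        showListing("Users", users,
                    [](const std::vector<std::string>& r) { return r[0] + " (" + r[1] + ")"; });

        std::vector<std::vector<std::string>> products = {{"coffee", "4.50"}, {"tea", "3.25"}};
        showListing("Products", products,
                    [](const std::vector<std::string>& r) { return r[0] + ": $" + r[1]; });
        return 0;
    }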

Modern software construction is all about rushing through the work, and the fastest way, most people think, to get a new piece constructed is to ignore most of what is already there. Writing new code is far faster than reading old code. Reading old code is hard. High turnover in development projects just makes that worse. So, we see that the pieces are often written as new detached silos, one after another.

The problem with that is that the pieces are rarely completely independent at a domain and/or technical level, so the newer pieces will need to do similar things to existing pieces, but slightly differently, which causes a lot of issues. These mismatches are sometimes very difficult to find because the pieces can all have different styles, conventions, idioms, and even paradigms. To really spot the problem would require a deep understanding of at least two pieces, if not more. But that means reading even more code. So, people usually just apply junk code to patch it in a few places, instead of fixing it.

If you can accept that, say, 90% of the code is duplicated in at least one other place in the codebase, the consequence is that, at a minimum, you are doing 10x the work to get the system built. If the redundancy is greater, then the excess work is far higher. On top of that, testing is proportional, so if you write 10x more code, you should do 10x more testing. The less you test, the more bugs get released into the wild, and when these blow up they will disrupt working on the next round of pieces. So it is getting worse, not better.

There is a simple and obvious measure to indicate when this is a significant problem with an existing development project. If the construction technique is extending the system, then later additions will get easier to accomplish. If it is piecewise, the ramp-up time for new programmers will be tiny, but each and every new piece will take longer and longer to get done. Partly because of the redundancies, but also because of the increasing disruptions from bugs already leaked out.

For programmers, you can tell which type of project it is when you start working on the codebase. Management, however, cannot use the opinion of new programmers to correctly inform them, because they can’t tell if it really is a mess of piecewise construction or if the programmer just doesn’t like to read other programmers’ code. So, it’s better for management to gauge the progress using programmers with longer experience on the project. Given some work similar to earlier work, does it take more or less time to get that accomplished? Is that trend changing? Because it is difficult to do that in practice, it usually is avoided too.

The other problem is usually worse. Most systems these days are attempted rewrites of earlier systems that are often using older technologies. Those earlier systems, though, are often rewrites of preceding systems themselves.

Going at a piecewise rewrite means tackling the easy and obvious pieces first, then gradually getting to at least the scope of the earlier project. But mid-way through, in order to justify the expenditure, the rewrite faces a lot of pressure to go live. It does so with a reduced scope from the earlier system, and many of the hardest problems are still left to be dealt with. And because it is piecewise, the work is getting harder and messier with each new release.

So, it is set up for failure even before the work has started.

The longer trend is that over generations of systems, the hard problems tend to keep getting ignored, and they grow worse. Often they become packaged up into different “systems” themselves, and so the problems repeat. Suddenly there are lots and lots of systems, each more trivial than the last, most of them incomplete, and the lines between them are not drawn for sane architectural reasons, but rather as an accident of the means of construction.

It’s odd, in that the availability of frameworks and libraries should have meant less time and better results. But that is rarely the case. When programmers had to write more of the codebase themselves, the systems were obviously cruder, but they had to manage their time better. They learned strong skills to help them avoid redundancies. A release cycle would be measured in years, but as that decreased to months and then to weeks, piecewise coding tended to dominate. You can pound out a new low-quality screen in a couple of weeks if you just redo everything, but that code won’t mesh well with the rest of the system, and most of that time was actually wasted. So, with constant short releases, it is far less risky to just use brute force and not worry about the longer-term problems, even if it is obvious that they are getting worse.

Realistically, the faster a system comes out of the gate, the more likely that the code will be assembled with piecewise construction. This was supposed to be better than the older, slower, more thoughtful ways that sometimes got stuck in analysis paralysis or even released stuff that deviated too far from being useful, but ultimately, because it is just another extreme, the results aren’t any better. An endless cycle of rewriting earlier failed piecewise construction projects with more piecewise construction is just doing the same thing over and over again while expecting the results to change.