Tuesday, March 29, 2022

Functions

When I first started coding, there was a distinction between functions and procedures.

A function was essentially an operator, as in mathematics. You hand it arguments, and it returns some output.

Procedures, however, were just a set of steps that the computer needed to follow. There was no return value, but there could still be a side effect, like setting an error code somewhere globally.
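As a rough sketch of that old split, here it is in Python, which does not actually separate the two. Every name here (area, print_report, error_state) is made up for illustration.

```python
# Hypothetical names; Python itself makes no function/procedure distinction.

# A global place to record an error code, set only as a side effect.
error_state = {"code": 0}


def area(width, height):
    """Function: you hand it arguments, it returns an output, nothing else changes."""
    return width * height


def print_report(lines):
    """Procedure: a set of steps to follow, with no return value."""
    if not lines:
        error_state["code"] = 1  # side effect: record the failure globally
        return
    for line in lines:
        print(line)
    error_state["code"] = 0
```

The caller uses area for its result, but can only learn how print_report went by checking error_state afterward.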

That distinction got washed away, particularly in C, where a function's return type was allowed to be void.

In a sense, if you were going to write some steps for the computer to follow, you would create a procedure. It was very focused on doing just the right steps. Inside of it, however, you might call a whole lot of functions that nicely wrapped some of the mechanics. Those functions were essentially reusable; the procedures, however, were not.

That seemed to be the foundation of the Procedural paradigm. All of the other organizational constructs that you need to craft large programs were left up to the coders.

That distinction between functions and procedures disappeared.

Then gradually, the idea of breaking up large computations into reusable functions vanished too.

Programmers preferred to see everything for a given procedure in its full detail. So, you’d see massive functions, often spanning pages and pages. Everything is there, very explicitly.

While in some ways it is easier to write code this way, it’s rather horrible to read these behemoths. By the time you’ve reached the end, you’ve forgotten how it all started. It’s like jamming an entire novel into one massive endless paragraph.

So, paradigms like Object-Oriented (OO) were invented to avoid this type of code. The computations, as small functions, that affect the object’s data are placed close together, close to the data.

To figure out how it works, you just need to understand how the objects interact, not what all of the underlying code does. It is a form of layering, implicitly placed into the programming language itself. If the objects themselves are relatable to the outputs, for instance, everything visible on the screen is its own object, then it makes it way easier to debug. You just see what the problem is when it runs, then adjust a few objects' behaviors to correct it.
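Here is a minimal sketch of that idea in Python, assuming a made-up on-screen Label object. The class and its methods are hypothetical and not tied to any real GUI toolkit.

```python
class Label:
    """Hypothetical on-screen object: the data and the small functions that touch it."""

    def __init__(self, text, x, y):
        self.text = text
        self.x = x
        self.y = y

    def move_to(self, x, y):
        # One small job: change the position.
        self.x = x
        self.y = y

    def set_text(self, text):
        # One small job: change the displayed text.
        self.text = text

    def is_empty(self):
        return not self.text

    def render(self):
        # Stand-in for drawing: describe what would appear on the screen.
        return f"({self.x},{self.y}) {self.text}"
```

If the label shows up in the wrong spot, the only places to look are move_to and render, not pages of unrelated steps.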

The earliest OO code often had a huge number of tiny functions; one-liners were common.

But as time went on, more and more Procedural bad habits invaded the work, to the point where now you often see a top level of faux objects, pretty much at that old procedure level, and then just straight-up Procedural code inside.

The problem with Procedural, though, was that because it didn’t explicitly define how the code should be organized, there was always a very wide range of different possibilities. Some programmers would make really good choices that resulted in simple, obvious, and readable code. But more often, it was just a twisty mess, hopelessly hacked at until it sort of worked. Freedom, particularly for inexperienced coders, is rarely a good thing.

The reinvasion of Procedural styles also had the unwanted effect of making programmers afraid of functions. Less was considered better. And later, they became afraid of layers too.

Sure, in the fairly early days one could argue that function calls could be expensive at times, but that hasn’t been the case for decades. It is far better to just assume that they are free, particularly since removing them later as an optimization is a whole lot easier than trying to add them into an existing mess to make it more readable.

Functions have another huge tactical advantage. Since they need to name a ‘virtual concept’, one that the computer doesn’t care about either way, they act as an explicit form of documentation.

For most OO code, the objects are the nouns from the technical side or business domain, while the functions, usually referred to as ‘methods’, are the verbs. That means that it is possible to decompose a given specification for a feature almost directly into its nouns and verbs, keeping a near 1:1 mapping between the specification and the code.
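As an illustration, take a made-up one-line specification: "A customer places an order, items are added to the order, and the order has a total." A rough Python sketch of the mapping, with every class and method name invented for this example, might look like:

```python
# Nouns from the hypothetical spec become classes: Item, Order, Customer.
# Verbs become methods: add_item, total_cents, place_order.

class Item:
    def __init__(self, name, price_cents):
        self.name = name
        self.price_cents = price_cents


class Order:
    def __init__(self):
        self.items = []

    def add_item(self, item):
        self.items.append(item)

    def total_cents(self):
        return sum(item.price_cents for item in self.items)


class Customer:
    def __init__(self, name):
        self.name = name
        self.orders = []

    def place_order(self, order):
        self.orders.append(order)
```

Reading the code back gives you nearly the same sentence as the specification.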

In a way, it solves the naming crisis. If you can explain, in full, how you are going to solve a problem for somebody with software, then you also have the names of practically every variable and method. Well, at least the major ones. Some of the underlying transformations might not have explicit terminology in the descriptions, but they usually do in either the software industry or the business domain itself.

But even if you don’t use the proper names, so long as you are consistent, the meaning and intent of the code is clear. And if another programmer finds better wording, most IDEs allow for easy name changing. So the code can be improved quite easily.

Given the original split between the code triggered from a high-level endpoint and the low-level mechanics, it makes a lot of sense to embed this into the layering.

You want all of your endpoints to be ultralight. Then where you have an explicit set of steps, it is best to wrap each one in a well-named self-describing function.
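A small sketch of what that could look like in Python. The deposit example and all of its function names are hypothetical, chosen only to show the shape.

```python
# Hypothetical example: an ultralight endpoint whose steps are named functions.

def parse_amount(raw_amount):
    """Turn the raw text input into a whole number."""
    return int(raw_amount.strip())


def validate_amount(amount):
    """Reject deposits that are not positive."""
    if amount <= 0:
        raise ValueError("deposit must be positive")


def apply_deposit(balance, amount):
    """Compute the balance after the deposit."""
    return balance + amount


def format_receipt(balance):
    """Build the message shown to the user."""
    return f"New balance: {balance}"


def deposit_endpoint(raw_amount, balance):
    # The endpoint only lists the steps; the mechanics live below it.
    amount = parse_amount(raw_amount)
    validate_amount(amount)
    new_balance = apply_deposit(balance, amount)
    return format_receipt(new_balance)
```

The endpoint reads like the list of steps it performs, and each step can be reused or tested on its own.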

Below this, however, you want as much reuse as you have time to add, mostly because it will cut down on massive amounts of work, testing, and bugs, and reduce stress. If the interactions at this lower level are properly coded to always do the right thing, the low-level testing drops to nothing as the code matures. That might not make sense for a quick little project, but most systems stay in active development for years and years, so it is a huge boost forward.

Although Procedural code, and essentially brute force, is the industry's preferred means of coding, it is way more time-intensive, and the lack of readability significantly shortens the lifespan of the code.

The work may get out of the gate faster, but all of the redundancies and make-work quickly catch up with the progress, killing it prematurely. If, instead, you try hard to keep the functions small and tightly focused, it leaves open the possibility of applying strong refactorings to the code later to remove any systemic problems with the initial rushed construction. This makes it possible to grow the code, instead of just reinventing it all of the time. It’s a much better approach to building stuff.
