The Programmer's Paradox: August 2022

Saturday, August 27, 2022

Would you like Fries with that?

I often run into software developers who have very poor relationships with their users.

I’m sure there are a few users-from-hell out there, but often the users are okay, and still, the relationship is quite bad. That’s unfortunate and unnecessary, and it just contributes to stress on both sides. Mostly, it can be avoided.

The first part of fixing the relationship is acceptance.

The software industry will tell you that you are an artisan, a craftsperson, a magician, or at least super intelligent. From this, many software developers come to believe that they should be worshipped, and that the users should be thankful for any tiny amount of effort that they make. The developers think they are special, above it all.

But the truth is far less romantic. The users have a problem. It’s a problem that a computer can help solve. So they hired a bunch of people to build things that solve their problems. Those people are highly skilled and hopefully professional, but they are still in their positions just to build solutions for other people’s problems.

In that sense, from the user perspective, while it is different from going to a fast food restaurant and ordering a meal, it isn’t really that different. They ask the developers to solve specific problems, then the developers figure it out, and give them back a solution. The users often fund, initiate, and drive the work; they do have a reasonable expectation of ‘service’.

“PROGRAMMING IS NOT A SERVICE INDUSTRY!!” I can hear some readers screaming.

Well, it is not entirely a service industry, and there are certainly programming jobs with zero service components, but since most software is used by people to solve their problems, most of it has significant service components. We deliver features; it’s built right into our job descriptions.

I know it is very hard to accept that, but accepting it is the first major step to not having a contentious relationship with the users.

If the users suddenly want you to make some quick hacks to an existing screen that seem unnecessary to you, you probably should not say “would you like fries with that?” but you should definitely think it. It will put the conversation in the right frame for you.

Now in no way am I recommending that you just blindly do whatever crazy thing the users want you to do. You don’t need to take them literally.

Often their “solutionizing” is just the way they are trying to express some aspect of their problem to you. You need to see through what they are saying to what they actually need. When you do see it you should feel free to redirect them to better solutions, ones that fit better into the overall context.

In that sense, if you have engaged with them and listened to their problems, you can help lead them in a more positive direction. But you still have to solve their problems.

Saying “no!” or “I’m too busy...” isn’t just rude, it is also self-defeating. Making up some lame excuse isn’t any better. Even if you don’t know what is driving their desires, they are still expressing a need, and you have to spend some time figuring it out.

Extreme candor isn’t necessary, but you need to be honest with people for them to believe you. If they believe you, they will eventually trust you. And if they don’t trust you, they certainly are not going to take your advice; things won’t go well.

Of course, if you fix the relationship, things will go far better in the future, but you still always need to treat the users with respect.

You want to stick to just the facts, be as honest as you can, and rephrase stuff into their terminology. You want to let them know what is happening, but not make excuses. If there are good, viable options, you can present them, but if not, don’t phrase it as a question. If there isn’t a choice, don’t offer one. If you don’t know something, admit it. If someone got sloppy, admit that too. It’s okay to say “that release didn’t go as planned” instead of pretending that it worked.

We’re not perfect and some things that seemed easy turn out to be brutally hard or impossible. If you're honest and they trust you, they’ll appreciate you letting them know. Not everything, and not to blame other people, just more of an understanding with them.

You know that it is your job to provide usable solutions to their problems; if they know that too, the respect will be mutual. And that will erase a lot of unnecessary stress and give you the ability to concentrate harder on your work. Just don’t forget to ask a lot of questions; they do understand their problems a lot better than you ever will.

Tuesday, August 23, 2022

End-points and Computations

There are lots of different ways to decompose large software projects.

A strong decomposition that is applied consistently across a system forms the base of good organization, which makes the development smoother and provides better quality.

One way to look at the different types of code in any large system is to separate it between end-points and computations.

We’ll start with computations.

If you have a bunch of inputs, you can apply some work to them, and you’ll end up with a bunch of outputs. That is a simple, rather pure, stateless computation.

Way down on the nearly trivial scale, we have operators like addition. You take 2 integers, add them together and provide the result. You can go slightly higher up to something like string concatenation, where you join two strings to form a larger one.

But it also applies to much higher, larger groups of instructions. For instance, you might calculate some complex metric like a bond yield, from the description and current time series around the bond in a market. Way more information than addition or concatenation, but still the same general idea. It’s just a computation.

It’s stateless, and everything you need to compute successfully comes in from the inputs. Then it either works or it gives you a reason for not being successful.
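As a rough sketch of that shape, here is a tiny computation that either works or hands back a reason. The names `Outcome` and `current_yield` are made up for illustration, and the formula (annual coupon divided by price) is a deliberately simplified stand-in for a real bond yield calculation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    """Hypothetical result type: either a value or a reason, never both."""
    value: Optional[float] = None
    error: Optional[str] = None


def current_yield(annual_coupon: float, price: float) -> Outcome:
    """Stateless: the result depends only on the two arguments passed in."""
    if price <= 0:
        return Outcome(error="price must be positive")
    if annual_coupon < 0:
        return Outcome(error="coupon must not be negative")
    return Outcome(value=annual_coupon / price)


ok = current_yield(5.0, 100.0)   # succeeds, carries a value
bad = current_yield(5.0, 0.0)    # fails, carries a reason instead
```

Nothing is read from or written to anywhere else, so calling it twice with the same inputs always gives the same answer.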

Described that way, you can see that ‘compiling’ for a language like C or Golang is in itself also just a computation. You give it the ‘source’ and you end up with a binary of some type or a list of very specific errors.

But we can go even higher. You might give some piece of code a URL, and some navigation stuff, and it will return a clump of data like JSON. It's still a computation, just one part of it is distributed. It triggers one or more other machines to do their computations based on the input you sent it.

So you could structure the code that calls someone else’s REST API as a series of stateless computations. And if the API were somewhat stateful itself, you can just take the output of one call and use it as the input for another, and still keep it somewhat stateless. At least each of the calls is stateless, even if the combined interaction is not.
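To make that concrete, here is a minimal sketch of threading the output of one call into the next. Everything in it is an assumption for illustration: the `/items` path, the `cursor` and `next` fields, and `fake_fetch`, which stands in for a real HTTP client so the example runs without a network.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transport: takes a URL and query parameters, returns parsed JSON.
Fetch = Callable[[str, Dict[str, str]], dict]


def get_page(fetch: Fetch, base_url: str,
             cursor: Optional[str]) -> Tuple[List[str], Optional[str]]:
    """One stateless call: everything it needs arrives as arguments."""
    params = {"cursor": cursor} if cursor else {}
    body = fetch(base_url + "/items", params)
    return body["items"], body.get("next")


def get_all(fetch: Fetch, base_url: str) -> List[str]:
    """The combined interaction feeds each call's output into the next call."""
    items: List[str] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = get_page(fetch, base_url, cursor)
        items.extend(page)
        if cursor is None:
            return items


def fake_fetch(url: str, params: Dict[str, str]) -> dict:
    """Stand-in for a remote API that hands out a cursor for the next page."""
    pages = {
        None: {"items": ["a", "b"], "next": "p2"},
        "p2": {"items": ["c"], "next": None},
    }
    return pages[params.get("cursor")]


all_items = get_all(fake_fetch, "https://example.invalid")
```

Each `get_page` call stands alone; the only thing resembling state is the cursor, and it is passed along explicitly rather than hidden somewhere.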

We can also see that going to some large backend for persistent data, say an RDBMS or NoSQL database, is the same. We might give it an id for something, and it returns all of the associated data with that id, in a particular structure. Still a computation, and still devoid of state on each call.

Then that leads us to the definition of an end-point. Really it is any leftover code that is itself not a computation.

For instance, in the backend REST API, there is some routing code that binds the URL to the code you want it to execute. Sort of a computation, but not really. It’s just the end-point mechanics to route incoming things to the right handlers. You could pull out any simple computations from the mechanisms used to actually trigger the code.
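One way that split might look, as a sketch only: the computations know nothing about URLs or status codes, and a small routing table does the binding. The paths, the `ROUTES` table, and `dispatch` are hypothetical names, not taken from any particular web framework.

```python
from typing import Callable, Dict, Tuple


# Computations: no knowledge of URLs, requests, or status codes.
def add(a: int, b: int) -> int:
    return a + b


def concat(left: str, right: str) -> str:
    return left + right


# End-point mechanics: bind a path to a handler and unpack the raw inputs.
Handler = Callable[[Dict[str, str]], str]

ROUTES: Dict[str, Handler] = {
    "/add": lambda q: str(add(int(q["a"]), int(q["b"]))),
    "/concat": lambda q: concat(q["left"], q["right"]),
}


def dispatch(path: str, query: Dict[str, str]) -> Tuple[int, str]:
    """Route an incoming request to its handler and map failures to a status."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "no such route"
    try:
        return 200, handler(query)
    except (KeyError, ValueError) as exc:
        # Missing or malformed parameters are an end-point concern.
        return 400, f"bad request: {exc}"


status, body = dispatch("/add", {"a": "2", "b": "3"})   # (200, "5")
```

The messy parts (parsing strings, missing parameters, unknown paths) all stay in the routing layer, so `add` and `concat` could move into a shared library untouched.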

A GUI might have a bunch of buttons that people can press. As they do, sometimes a ‘context’ builds up. Then at some point that leaves the interface end-points and triggers the desired computation. Maybe if it’s a web app, the app itself is mostly end-points, and the backend directs it to the correct computation.

So, any end-point code is stateful, contextual, configurable, etc. Often quite messy. All of the other bits of code that are necessary to wire up stuff to users or other computers, to run correctly. It could include operational issues, platform issues, configuration, etc.

And it tends to be the code that runs into the most difficulty.

It’s not that hard to write a computation, and while it might take a bit of work to get a multi-party distributed computation working correctly, it is fairly easy to test its behavior.

It is hard though to set up a bunch of end-points and make sure that they are durable enough to withstand the chaos around them. So, end-points tend to have a lot more bugs. They are the front line, where all of the problems originate.

So, now if you can clearly separate these two different types of code for a large system, it opens up a lot of good organizational properties.

For instance, you know to put all of your computations into shared libraries, so that a lot of other people can use them too. But you also know that the end-points are specific, and tend to be ugly and redundant. So, you don’t waste a lot of time trying to figure out how to reuse them. They tend to be single-use. At best maybe you provide a skeleton or template or something to get people up and going faster.

If you lean on that perspective, you realize that minimizing end-points is a great thing, and maximizing the computations is good too.

When we have talked about building up reusable lego blocks from the ground up, it usually means the computations. Where we have talked about just writing things up quickly, it usually means the end-points. And if you have a lot of thin end-points separated from libraries of shared computations, you have a great deal of flexibility in how you will deploy stuff, but also the ability to leverage the bulk of the work you have already done.

Sunday, August 14, 2022

The Code

I am a software developer.

I will not lie to anyone on my team.

I will value engineering far above process.

I will value completing work far above discourse or politics.

I will spend time with the end users and empathize with their problems.

I will take the time to really understand what I am coding.

I will try to keep things organized even in the face of intense time pressure.

I care if my code does not meet the expectations of a) the end users b) operations c) testers and d) myself.

I will take the extra time and effort necessary to ensure that the code works.

I will spend the time to investigate anything I don’t understand.

I will follow suit with any already existing code in our codebase and always try to leverage it first, before blindly reinventing it.

I will use all existing dependencies to their fullest extent before dumping in new ones.

I will not blindly copy code from answer sites and assume it works. Instead, I will read it, figure out what it does, and then learn.

I will communicate with everyone and make sure they are all on the same page. I will not hide any relevant details. I will stick to the facts and just the facts. I will write down what I know, even if the formatting is simple.

I will always try to explain things as simply and clearly as I can, and tailor those conversations to the audiences I am addressing.

I will not waste time or effort on make-work. Everything I do will be necessary to keep the project moving forward and to get the software out to the users.

I will stay out of any politics. I will avoid drama and childish behaviour.

I will not continue to work on things that I know are causing harm to others. I may not be able to stop it, but I will not participate.

It’s not about money. It’s not about power. It’s not about ambition. It’s not about ego. It is only about building things that are good enough to actually help people.

I am a professional software developer. I will act professionally at all times.

Thursday, August 4, 2022

Architecture

I was reading an online discussion about architecture. It’s always been a rather odd topic in that it has actually been extremely well codified, at least a few times, but that knowledge always seems to get lost. So more often, people just define it as anything they want it to be, as long as it benefits their personal agenda.

Pretty much all that is wrong, but rather than point to a bunch of better, but old references, I’ll continue a long and painful tradition of throwing around my own definition.

A software system has an architecture if and only if it is organized. That is, if it is just a mess of stuff thrown together, then it has no architecture. It may have had an architecture in its early years, but if subsequent work ignored that, then it is just a pile of stuff now.

The process of establishing an architecture is part structural and part political. That is, for any non-trivial system, it is both so slow and so expensive to build that the act of arranging for the significant costs of the work to be covered is pure politics. All sorts of people have their fingers in this pie, so the technologies and stacks that will be picked as the foundations are just artifacts of the act of raising money. It’s true for startups, but it is also true for any large organization.

Each foundational dependency has an obvious set of strengths and weaknesses. You could rationally choose between them based on those properties to arrive at the best fitting pieces for any given solution. That almost never happens; politics drives people to choose irrationally, often based on prejudice, personality or limited experiences. So, generally the technologies and tech stacks are picked way in advance, long before the real work begins. Often they are poorly matched to the requirements. A lot of work goes into sloppy patches to cover over these weaknesses.

Once you get past the effects of politics, the rest is structural. There are two primary areas.

The first is grouping related pieces together. That is, all of the code related to a behaviour of the system like reporting should be placed together. But it’s not actually that simple, in that there are always at least two dominant dimensions at play. The problem domain, too often called the business, imposes vertical constraints on the system, while the technical domain, as a foundational layer, imposes horizontal constraints. These are at odds with each other, although in modern times it has become more popular to just pay lip service to any large scale technical issues and blindly follow the business.

Still, you can organize the codebase, even if it is huge, into layers of nicely fitting boxes that clearly delineate the subcomponents at both a high and medium level. The strength of doing this early is that if the pieces are all independent from each other, the work can effectively be scaled (parallelized) and the overall quality will be far better.

You can go further though. Instead of breaking down the mechanics by clumping together similar bits of code, you can organize it by fundamental composite data types. When this is correctly applied, it tends to mitigate the vertical and horizontal mismatches, but also simplifies both the code and any visualizations of it. You do see this happen in practice, but because it is a very advanced technique, it is more often applied to high quality low-level commercial products, than to the massive amounts of domain specific application code that is far more frequently written these days.

The second primary area is performance related. All systems have to fit into a larger environment. They share a lot of data, they are constantly getting data feeds and sending out data to other systems. At this level, there are standardized enterprise ways of interconnecting the different systems.

The core problem is usually frequency. One system may be able to generate or collect data far faster than another can import it, so there are common solutions like queues that get placed in between to prevent synchronization problems.
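A minimal in-process sketch of that idea, using Python's standard `queue.Queue` with a size limit: the fast side blocks when the queue is full instead of overwhelming the slow side. The function names, the capacity, and the artificial delay are all assumptions for illustration; a feed between real systems would normally use a message broker rather than two threads.

```python
import queue
import threading
import time
from typing import List


def producer(q: queue.Queue, count: int) -> None:
    """Fast side: generates items as quickly as the queue will accept them."""
    for i in range(count):
        q.put(i)        # blocks while the queue is full
    q.put(None)         # sentinel: nothing more is coming


def consumer(q: queue.Queue, out: List[int]) -> None:
    """Slow side: takes a little time to import each item."""
    while True:
        item = q.get()
        if item is None:
            return
        time.sleep(0.001)   # simulated slow import
        out.append(item)


def run(count: int = 20, capacity: int = 4) -> List[int]:
    q: queue.Queue = queue.Queue(maxsize=capacity)
    received: List[int] = []
    threads = [
        threading.Thread(target=producer, args=(q, count)),
        threading.Thread(target=consumer, args=(q, received)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


result = run()
```

With one producer and one consumer, the items arrive in the order they were sent, and the producer never runs more than `capacity` items ahead.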

There are literally 50 years of different data formats, communications tools and protocols, and a nearly unlimited number of different right and wrong ways to model the data itself, so a great deal of time is spent on these data transfers. In some cases there are well established communications patterns like having a common data bus for the whole organization, but oddly, impatience more often means that these are ignored and each and every feed is badly home rolled. Often even if the internals of the system are tightly organized, its imports and exports are not, usually because the different teams responsible don’t like to agree on any standardized ways of moving data around. They think it takes too long to implement (while ignoring the fact that they are just reinventing the same wheels, over and over again).

There are a few other key architectural issues, like centralization, avoiding massive duplication, stale data and choosing between interface types like cli, native, web, mobile, etc. All of these have been heavily explored and resolved in the past, but again much of that knowledge has been forgotten too.

That’s pretty much architecture in a nutshell.

At least across parts of a large company, we could really be a lot more organized and save ourselves huge amounts of redundant work, but strangely we are unable to do that. It’s possible that the initial political component ends up overshadowing the technical issues, so that the initial loss of time spent there forces all of the following work to get horribly rushed, and then done badly. Not sure, but over the decades we’ve learned more than enough to be able to avoid this fate, yet it still seems to befall most large internal projects.

Our attempts at, instead, just trying to be fully reactive and then letting the work magically evolve on its own tend to court pure disorganization as the “architecture”, so they are usually far, far worse.

Either way we keep building things badly, then realizing they are bad, then restarting from scratch again. By now there are crappy big systems out there that are in their 7th or 8th generation. Maybe we should just do a better job of getting things organized first before we rush off to repeat the same mistakes again. Not making a mess on purpose is really the aim of architecture. You can tell how good people are at it, by the results that they leave behind.