The Programmer's Paradox: Data Flow

I like to write this post every few years. People generally skim over it, and then ignore what I am saying. So I’ll probably end up writing it a few dozen more times, then give up completely.

There is one major trick to keeping everything sane while building a big system:

Forget about the code.

Just ignore it.

The problem isn’t the code, it isn’t ever the code.

It’s the data.

If the data flows around the organization in a clean, organized manner, then the code is a secondary issue: just the small translations needed as the data moves about.

That is, if the data is good, then any problems with the code are both obvious and easily fixable.

If the data is bad or incomplete, the entire stability and trustworthiness of the system is busted. Collecting megabytes of useless data is an epic waste of time. There is no foundation for the work.

Also, stepping back and viewing the entire charade as just data flowing around from place to place is far simpler than trying to grok millions of lines of code. Mostly, we collect some data from the outside and combine it with data collected from people in the middle of their problems. That’s it.

You should never be writing code to retroactively repair data. If the data doesn’t exist, or it is ambiguous, or it’s stored in a broken format, then that is the real problem. Patching that with flaky code is not a real solution. Fixing the data is.
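One way to read that in practice is to refuse bad data at the point where it enters the system, rather than guessing at it later. Here is a minimal sketch in Python, under assumptions of my own: the record shape (an `email` and a `signup_date`), the `BadRecord` exception, and the `parse_signup` function are all hypothetical names made up for illustration.

```python
from datetime import date


class BadRecord(ValueError):
    """Raised when an incoming record is incomplete or ambiguous (hypothetical)."""


def parse_signup(raw: dict) -> dict:
    """Accept a record only if it is complete and unambiguous.

    Nothing is guessed or patched here. A bad record is rejected so
    that it gets fixed where it was produced.
    """
    # Missing or empty fields are a data problem, not something to paper over.
    missing = [key for key in ("email", "signup_date") if not raw.get(key)]
    if missing:
        raise BadRecord(f"missing fields: {missing}")

    # A date like 03/04/2021 could be March or April, so only ISO 8601
    # (YYYY-MM-DD) is accepted in this sketch.
    try:
        signup = date.fromisoformat(raw["signup_date"])
    except (TypeError, ValueError) as exc:
        raise BadRecord(f"signup_date is not an ISO date: {raw['signup_date']!r}") from exc

    return {"email": raw["email"], "signup_date": signup}
```

The ambiguous date never makes it into storage, so nobody has to write a repair script for it three years from now.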

If you understand how the data needs to be structured to store it properly, and you honor that knowledge in the code as it moves around, then everything else is a thousand times easier. It’s code you need to write to move it here, or there. It’s code you need to write to translate it into another format. It’s code you need to write to combine some of it together to craft derived data. That’s pretty much it.
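To make those three kinds of code concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `Order` record, its fields, and the function names are hypothetical, not taken from any real system. The point is only that once the structure is pinned down, the translate and combine steps are small and boring.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    """The structure the data needs in order to be stored properly (hypothetical)."""
    order_id: int
    customer_id: int
    placed_on: date
    amount_cents: int  # integer cents, so there is no floating-point money


def to_json(order: Order) -> str:
    """Translate: the same data in another format, ready to move elsewhere."""
    record = asdict(order)
    record["placed_on"] = order.placed_on.isoformat()
    return json.dumps(record, sort_keys=True)


def from_json(text: str) -> Order:
    """Translate back, honoring the same structure on the receiving side."""
    raw = json.loads(text)
    return Order(
        order_id=raw["order_id"],
        customer_id=raw["customer_id"],
        placed_on=date.fromisoformat(raw["placed_on"]),
        amount_cents=raw["amount_cents"],
    )


def total_by_customer(orders: list[Order]) -> dict[int, int]:
    """Combine: derive per-customer totals from the stored orders."""
    totals: dict[int, int] = {}
    for order in orders:
        totals[order.customer_id] = totals.get(order.customer_id, 0) + order.amount_cents
    return totals
```

If the round trip through `to_json` and `from_json` ever fails to give back an equal `Order`, the bug is obvious and local, which is the whole argument.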

Then you can spend your creative energies on building up enough sophistication to really help people solve their real problems.

Sunday, June 6, 2021