I use the term ‘software system’ loosely. I usually intend it to mean: all of the boundaries for a set of related solutions that have been or will be implemented with software.
In that sense, it is less about the technical parts of the ‘system’ and more about how they all come together to help people.
I do this mostly because I tend to visualize a ‘problem space’ as a flat 2D terrain. It is a convenient oversimplification. It is a big, wide, open, empty field of grass which spans over related problems.
When I am doing greenfield work, I see the start as picking one spot in that field. You start there, building up enough structure to be useful. First, you lay down some common foundations, then you start adding in functionality that implements the features you know will help solve the problem.
As you do this, people will see the effort and start making suggestions. Some will want to go off in one direction, while others will prioritize the opposite way.
The trick to keeping it all as usable as possible is to slowly expand out your borders, but not in too many directions all at once.
Someone once told me that for software, you should never pick a path unless you are willing to walk it. From this perspective, it usually means that you won’t expand into another area in the field haphazardly. If you do choose to go there, it needs to be done correctly. That is, adding a few really good, solid features is way better than adding a million lame ones.
The same is true for the data. If you need new data, you add it carefully, properly structured, or not at all.
Overall, though, you start at one specific spot and keep growing. If it’s a good set of programs and people find it valuable, you’ll probably be at it for years, if not decades. So, it’s really crucial to its lifespan that the work you do in the very early days is as good as it can be. It needs to be neat, tidy, organized, and carefully thought out.
With that in mind, calling the current work and any of the rather obvious future work a ‘system’ works quite well. The system isn’t the code, but rather it is all of the territory that the code is trying to cover at some point. You might build a system for handling the account problems in a large corporation, for example. There might be lots of included pieces, and even some nearly stand-alone sub-systems, but they are all trying to fit together to deal with the same problems.
So, it’s similar to seeing the forest for the trees. The boundaries of the expected work are the system, but the system may not stretch right up to those boundaries yet.
From this view, it makes it easier to understand a bottom-up implementation. You might not know all of the features that people will ask for, but you should have a reasonable sense of the territory you are covering right now. Lots of that territory is similar, so building reusable components and engines will really help in getting more ground covered at a faster rate.
The classic example is reporting. You know people will need it at some point, so instead of just hardcoding a couple of static examples, it would be better to either offload it to somewhere else as exported data, or write some generic engine that is flexible enough to cover its rather massive width. The trick is not to write a lot of code, but rather to leverage any code you do write to cover the largest parts of that territory. In software, a little foresight goes a long way.
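To make the generic engine idea a bit more concrete, here is a minimal sketch in Python. It assumes the data is already available as a list of dictionaries; the names `run_report` and `AGGREGATES` and the sample rows are invented for illustration, not taken from any particular product. The point is only that one small function can cover a whole family of reports that would otherwise each be hardcoded.

```python
from collections import defaultdict

# Hypothetical set of aggregate names; a real engine would likely offer more.
AGGREGATES = {
    "count": len,
    "sum": sum,
    "min": min,
    "max": max,
}


def run_report(rows, group_by, measures):
    """Group rows (dicts) by the given columns and aggregate each measure.

    measures maps an output column name to a (source column, aggregate name) pair.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        groups[key].append(row)

    report = []
    for key in sorted(groups):
        line = dict(zip(group_by, key))
        for out_name, (column, agg) in measures.items():
            values = [r[column] for r in groups[key]]
            line[out_name] = AGGREGATES[agg](values)
        report.append(line)
    return report


# Made-up sample data, purely for illustration.
orders = [
    {"region": "east", "status": "open", "amount": 120},
    {"region": "east", "status": "closed", "amount": 80},
    {"region": "west", "status": "open", "amount": 50},
    {"region": "east", "status": "open", "amount": 30},
]

# Two different reports from the same engine, described as data rather than code.
by_region = run_report(
    orders, ["region"], {"total": ("amount", "sum"), "orders": ("amount", "count")}
)
by_region_status = run_report(
    orders, ["region", "status"], {"largest": ("amount", "max")}
)
```

Each new report is then a new description rather than a new program, which is the kind of leverage that lets a little code cover a lot of territory.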
Thinking of a software system this way really helps in making a lot of the implementation decisions. If your field doesn’t cover having a million users, then designing an architecture to support that scale doesn’t make any sense.
More importantly, if you are located in one corner of the field, then trying to expand way over to the other side of a nearby hill doesn’t make a lot of sense either. That’s far enough away that it is clearly another system, another project and thus another codebase.
Building even medium-sized software is surprisingly complicated, so finding ways to frame it nicely really helps with making better decisions. Since time is a precious resource, getting to the right code as quickly as possible is important. Seeing it all as a system occupying some territory in an endless field is a good guide.
The Programmer's Paradox
Software is a static list of instructions, which we are constantly changing.
Thursday, June 11, 2026
Thursday, June 4, 2026
Round Holes
A classic expression describes shoving a square peg into a round hole. Basically, you’re matching the wrong part with the wrong location.
I see this often in software architecture. Sometimes I like to use the term impedance mismatch. There is a component that someone is suggesting for use in a different variation of its problem space. It fits badly.
Sometimes the issue is access. There are hard limits on the availability of stacks, libraries, frameworks, tools, and services. In some organizations, they need to be vetted and approved first. This can be very slow, but I still think it is a good idea; using too many technologies is a nightmare.
In some cases, it is knowledge. People tend to gravitate to the things they already know, so they’ll prefer technologies they’ve used in the past, even if it means forcing them into place. That makes little sense if the architect has that preference, but it’s a different development team that does the construction.
Sometimes it is a misunderstanding. The marketing for the component says it will do everything, perfectly, but in reality that claim is far more than a stretch; the features were hastily cobbled together to help sales. Still, the people funding the effort got swayed, so now everyone else is forced to jam weak pieces into the wrong place.
The habit I’ve always encouraged is to look carefully and admit that a round hole is, in fact, just a round hole.
I don’t start with the square pegs; that is my last step. Not surprisingly, this can frustrate people, in that there might not be any round pegs available. They don’t exist or can’t be used. For me, though, I still want to know that the hole is round, even if I can’t fill it correctly right now, or in some cases, ever.
But if you can do that, and imagine for any bunch of holes what would fit rather perfectly, first, before getting lost in the messiness, it will really help both simplify the effort and get it as good as possible.
Otherwise, you risk unintentionally creating a Rube Goldberg machine. For people unfamiliar with those machines, they are works of art that are deliberately overcomplicated. Just a collection of mismatching pieces made to do something interesting. They make great entertainment, but are not something that you’d want to have to rely on.
I’ve seen that too often in enterprise architecture, systems built out of odd, mismatching components sloppily glued together into a giant house of cards. That, paired with excessive brute force for the glue, tends to generate an endless amount of support and bug fixing, while never really working correctly. The system exists, but it is just off by enough that it would be better if it didn’t. It’s a time sink. Now, instead of a solution, it is an ill-placed speed bump.
Often, to avoid that fate, I want to just look at the way the data needs to flow around at the high, rather abstract level.
You need to get the major entities from other sources, persist it all, and then deliver to interfaces, reporting, other systems, etc. If you understand the amount of information, its timeliness, and frequency, you start to get a sense of the minimum pegs you need below it. If you can grok that, then you can start the torturous phase of trying to see what is actually available and whether or not it is close enough to be workable. But if you go the other way and pick the components first, you’ll quickly get lost in gluing together odd parts for no reason.
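A back-of-the-envelope version of that sizing exercise might look like the following Python sketch. Every number in it is a made-up placeholder, and the `Feed` class and helper functions are hypothetical names; the only point is that volume, timeliness, and frequency can be turned into a few rough figures before any component is chosen.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class Feed:
    """One incoming or outgoing flow of data. All figures are rough guesses."""
    name: str
    records_per_day: int
    bytes_per_record: int
    peak_factor: float = 1.0  # how much busier the worst second is than the average


def daily_bytes(feed):
    """Raw volume for one feed over a day."""
    return feed.records_per_day * feed.bytes_per_record


def peak_records_per_second(feed):
    """Average rate scaled by the assumed burstiness."""
    return feed.records_per_day / SECONDS_PER_DAY * feed.peak_factor


def yearly_storage_gb(feeds):
    """Total raw storage per year in decimal gigabytes, ignoring indexes and overhead."""
    total = sum(daily_bytes(f) for f in feeds) * 365
    return total / 1_000_000_000


# Placeholder figures, invented for illustration only.
feeds = [
    Feed("inventory updates", records_per_day=864_000, bytes_per_record=500, peak_factor=20),
    Feed("reference data", records_per_day=1_000, bytes_per_record=2_000),
]

peak = peak_records_per_second(feeds[0])
storage = yearly_storage_gb(feeds)
```

Figures like a couple of hundred records per second at peak and well under a terabyte per year describe a very different hole than millions of concurrent users, and that difference is visible before a single technology has been named.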
It’s the same form of thinking that is needed to get good, simple, clean code, too. You have to see it first from a top-down perspective, before trying to build it up from what’s already available. It’s really the only way to keep from getting lost, but also to leverage reuse, encapsulation, etc. You want to know the scope of the problem first, come up with a near-perfect solution, and then map that impossible solution back to things that are possible. It probably sounds a bit crazy to people who can’t see it that way, but it is a perspective that anyone can learn to leverage. A superpower of sorts.
So we can get there with three easy questions.
- What is the ‘full’ scope of the problem?
- What would solve this perfectly?
- What’s available to approximate that perfect solution?
In an enterprise that might be building up a replacement system for tracking some type of inventory or case management. The primary features are pretty well known; the useful secondary ones are findable with a bit of investigation.
Perfection might be a dynamic data store to accommodate wide but slowly changing shallow data. The users need a nice GUI to get at this and keep control. The incoming data is real-time, vibrates occasionally, so a queue would protect it and help with integrity. The system feeds a few others that specialize in other forms of management. It’s always a smallish number of people. It should all run in a managed environment.
This then is the hole that needs to be filled in with whatever technologies are available now, in the future, or can be suitably crafted in a “reasonable” time.
Contrast that with something where the data rarely changes, there are millions of users constantly accessing it, and they are the primary source of the data. It’s a very different hole that likely needs industrial-strength pegs in order to keep it going. It’s not a system running on one or two boxes, but requires a large cluster of machines all cooperating to cope with its huge and variable load. The scale is so large that there is no overlap with that first medium system, so it’s unlikely that they should share any common technologies. It’s more of a star-shaped hole that needs special stuff to fill it.
The converse is also true, in that any of the technologies suitable for the second design would be grossly over-engineered for the first one. You can’t just cherry-pick a few and shove them into place. One is a 2D circle that needs to be painted, the other is a 3D hole that needs to be filled.
In that sense, you learn as much as you can about the full width of the problem, then let your imagination run wild with getting it perfect. With those boundaries in place, you can start picking the fewest number of pieces that come close to filling it. There will be ugliness and rough edges, but you’ve found them early and minimized them, which is the best you can do if you can’t just build it all from the metal to the top.
Thursday, May 28, 2026
Versioning
If you start from the premise that a system is just a series of access points into a vast array of computations, and you accept that there will always be a huge number of changes to the underlying code, you can see why versioning is messy.
All the computer is really doing is taking a bunch of inputs, grinding through computation, then spitting it out. But we often end up changing these computations, sometimes because we had obvious or subtle bugs, sometimes because we’ve acquired new knowledge about how to do the work better, more accurately, or much faster.
At the high level, people interact with all of these access points, apply some variability to them, and then set the computations in motion. It might take a millisecond, an hour, or even a few days. The interaction might be rapid (real-time-ish), or it might just be infrequent. What is important to these people is that they can trust that the computer does the thing that they expect it to do. Trust is the bedrock.
We can skip over any sort of difference; the output is either a blob of text or a pretty little graphic of some type, it’s all just variations on presentation. The text could be typed and structured, which doesn’t matter either.
In order to trust the system (app, program, plugin, etc.), one key property is that the behaviour has to be ‘stable’. It should not change day to day, hour to hour. If you used it yesterday to do something, then you expect that with the same inputs, it will do pretty much the exact same thing (determinism). An added expectation is that if there were changes, then mostly those changes would be adding more features, not changing the old ones.
This is the core of what we call backward compatibility. Most programmers think of it in terms of the stability of the APIs they are calling, but really, it is an overall property of the system itself. It is backward compatible if and only if every interaction, whether by code or by humans, is fully preserved after endless updates. If it worked ten years ago, it will work exactly the same today.
There is a loose exception for bug fixes, particularly bugs that have rendered the functionality useless. These are obviously not expected to be backward compatible, as the old behaviour does not correctly match expectations, so it needs to be changed to something else.
This relation is expressed nicely in three-part version numbers.
The last number is bumped up for one or more bug fixes. Going from n.n.10 to n.n.11 means that at least one bug was fixed, maybe a dozen of them.
The second-to-last number is usually described as a minor enhancement, but honestly, it is really just there for added functionality. Nothing else changed, nothing was deleted. So, if n.10.n is bumped up to n.11.n, you expect backward compatibility for all of the existing functionality, and there is now some new functionality included.
That leaves the first number to clearly state that you have broken backward compatibility. It is essentially a flag. There is some change that is major enough that the user can no longer necessarily expect the same inputs to produce the same results as before. Something big changed, and it will be noticed.
If people were strict in the usage of version numbers, and if they respected the notion that any 0.n.n version was just an initial demo or test instance, then even if the system was under active development for years, if it was backward compatible, then 1.9343523.14 would be a reasonable version number. 9M times new features were added, but the rest is still intact. Lots of stuff has been added, but all of it is backward compatible. The last round of added features needed 14 tries to get the bugs knocked out. All of these should have been in testing.
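The strict reading of the versioning scheme described above is simple enough to sketch in a few lines of Python. This is only an illustration: `parse_version` and `classify_change` are hypothetical helpers, the wording of the returned labels is mine, and the sketch ignores pre-release tags and build metadata that real version strings sometimes carry.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text):
    """Parse 'major.minor.patch' into a tuple of integers."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdecimal() for p in parts):
        raise ValueError(f"expected three dot-separated numbers, got {text!r}")
    return Version(*(int(p) for p in parts))


def classify_change(old, new):
    """Describe what moving from old to new promises, under the strict reading."""
    a, b = parse_version(old), parse_version(new)
    if b <= a:
        return "not an upgrade"
    if a.major == 0:
        # 0.n.n is treated as a demo or test instance, so nothing is promised.
        return "pre-release: no promises"
    if b.major != a.major:
        return "breaking: backward compatibility is not promised"
    if b.minor != a.minor:
        return "additive: new functionality, existing behaviour preserved"
    return "bug fixes only"


# Comparing parsed numbers, not strings, is what makes 1.10.0 newer than 1.9.0.
newer = parse_version("1.10.0") > parse_version("1.9.0")
long_lived = parse_version("1.9343523.14")
```

Because each part is an ordinary integer, a number like 1.9343523.14 is perfectly legal; it just says that a great deal has been added without anything being broken.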
As it is with user interfaces, it is true for any pure computational dependencies below. Libraries, frameworks, languages, tools, etc. Strict usage of the version numbers is enough to get a very strong sense of both how the development is going and whether the authors even understand backward compatibility.
Probably the most embarrassing self-inflicted mistake a software developer can make is to push out a release that immediately crashes due to an underlying dependency change. If they were doing things reasonably, this would never occur. At minimum, an embarrassing non-backward-compatible library change would get picked up in testing. Untested code should never, ever get into a release. Subtle changes could slip through, but at the bare minimum, the work should have been smoke tested to catch exactly this sort of mistake. But the stronger habit is to only upgrade questionable libraries at the beginning of a long development cycle, while also doing lots of non-destructive refactoring. That is, before dumping in new stuff, you tidy up the junk from the last release and update some of the libraries. Run it a lot yourself until you are sure it is stable, then you go to town to add new stuff.
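A smoke test for a dependency does not need to be elaborate. The Python sketch below uses the standard `json` module purely as a stand-in for whatever library you actually depend on; the checks themselves, and the helper names `smoke_checks` and `run_smoke_tests`, are assumptions for illustration. The idea is just to exercise the handful of calls you rely on with known inputs and confirm they still behave the way your code expects.

```python
import json


def _raises(exc_type, func, *args):
    """Return True if calling func(*args) raises exc_type."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


def smoke_checks():
    """Each check is (description, callable returning True when behaviour is as expected)."""
    return [
        ("round-trips a nested record",
         lambda: json.loads(json.dumps({"id": 7, "tags": ["a", "b"]}))
         == {"id": 7, "tags": ["a", "b"]}),
        ("sorts keys when asked",
         lambda: json.dumps({"b": 1, "a": 2}, sort_keys=True) == '{"a": 2, "b": 1}'),
        ("rejects malformed input",
         lambda: _raises(json.JSONDecodeError, json.loads, "{not json")),
    ]


def run_smoke_tests(checks):
    """Run every check and return the descriptions of those that failed."""
    failures = []
    for description, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a crash inside the dependency counts as a failure
        if not ok:
            failures.append(description)
    return failures


failures = run_smoke_tests(smoke_checks())
```

If `failures` is not empty after a library upgrade, the release stops there. It will not catch subtle changes, but it does catch the embarrassing ones.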
If the library is any good, and it has done an excellent job at being backward compatible, this is extremely low risk. You can kinda cheat the game sometimes. But if it is some dodgy little thing written by a couple of people as an advertising attempt, then you would have to wrap it in very expensive testing for each and every little thing you’ve used it for. It’s this that makes most libraries not worth integrating, either because the testing is too much work or the risk is just way too high. Reading the code and applying some of its better ideas is more suitable.
Often, you can get a sense of the quality of the library just by looking at the version numbers. For instance, 24.3.2 is a suspicious number if the work is only a couple of years old. They’re not taking backward compatibility very seriously; they are high risk.
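That rule of thumb can be written down as a rough calculation. The Python sketch below assumes the project follows the three-part scheme rather than date-based numbering, and the threshold of one breaking release per year is an arbitrary placeholder, not an established figure; the function names are hypothetical.

```python
def breaking_releases_per_year(version, age_years):
    """Estimate how often backward compatibility was broken, from the first number alone."""
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    major = int(version.split(".")[0])
    breaks = max(major - 1, 0)  # 1.x.x is the first stable line, so it counts as zero breaks
    return breaks / age_years


def looks_risky(version, age_years, threshold=1.0):
    """Flag a library whose rate of breaking releases exceeds the (assumed) threshold."""
    return breaking_releases_per_year(version, age_years) > threshold


suspicious = breaking_releases_per_year("24.3.2", 2)
```

For 24.3.2 at two years old, that works out to 11.5 compatibility breaks per year, while something like 1.9343523.14 scores zero no matter how old it is.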
It comes across with some of the larger tech stacks, too. If there is a major version bump that fundamentally breaks all backward compatibility, but someone has the newer and older versions haphazardly laid on top of each other, you pretty much know that the confusion caused by being too loose with the versioning is going to cause a lot of chaos that will either waste a lot of your time or result in embarrassing bugs. If the break was wide enough, the new work should really abandon the ‘brand’ of the old work. They are two different things, even if that means it is harder now for the new version to get a lot of traction. Just because you decided to change it doesn’t mean everyone else in the world should change too. Once you’ve committed to a particular set of computations, you have to stay committed and only grow from there. You can’t just pick up and move to some other spot farther away and claim it is the same work; it is not.
Backward compatibility is hard, really hard, which is why everyone loves to cheat the game so much. But it is an essential property of stability, which is necessary for trust. If you want to do a good job providing some complex computations to others, it is going to be hard. There is no way to avoid it. If you do the hard work, then you can communicate it quite clearly with the version numbers. That will let people know that your work is serious.
Thursday, May 21, 2026
Feedback
Recently, my blog has been getting a lot more views. Unfortunately, a lot of the incoming fields for these reads are just tagged with ‘Other’.
That tells me that the traffic is not coming from the older established sites I know, like HackerNews, but is either fake traffic or newer sites that I haven’t seen. It would be nice to know which is correct. Are people actually reading these posts?
So, if you are reading this, I’d really appreciate you taking a moment to comment. Anonymous is fine, and since my comments are screened before they are published, feel free to say ‘do not publish’ if you want. A ping is good; mentioning the source helps.
A long time ago, I briefly dreamed of monetizing my writing, but as I realized that the way to do that is to effectively change what I am saying, I decided not to do that.
I’ve always had to be careful not to upset any of my current employers, but beyond that, I write what I know, either from firsthand experience or from conversations with others.
Because of that, and my limited writing style, it’s never been a popular blog, but I still feel, after decades, that I want to get what I understand down somewhere. Maybe people read it, maybe not. It’s okay.
The software development industry varies hugely, so not surprisingly, plenty of other people have had very different experiences in their careers, but I also do suspect that there is way too much propaganda out there that is deliberately trying to mislead people. It’s an immature, messy and often ugly industry.
With all that in mind, if you could take a moment to say ‘Hi,’ at least I’ll know if you really exist or are just a figment of the web’s imagination.
UPDATE: Ok, I got a few responses, which is great. Thanks! Seems like at least some of the traffic is RSS and Atom, which doesn't show up in the stats. It might be those views where I do get a country and browser type, but that still leaves a great deal of traffic as Other. I guess I'll never know if those are real or not.
If anyone has suggestions about future topics, that would be great too. I feel like I am getting too repetitive in my old age :-)
Thursday, May 14, 2026
Security
Programmers hate adding security to their systems.
First, it is a huge amount of work, and second, since it is so often left to the end, it is very ugly and disruptive work. A patchwork of hacks.
But that hatred is misdirected. Without enough security, the system they built is useless; well, worse than useless. If people use it, it could severely screw them over. Nobody would intentionally use something that helps criminals more than it helps them. Even if it is in a walled garden, you can never really be sure that someone isn’t motivated to take a peek.
It’s worth noting that I am not a security expert, and although I’ve had to deal with it a lot in my career, my practice might not be as strong as the experts would like. That being said, I’ll continue.
There are only a few things you need to worry about in security. First is actually identifying any ‘users’. You always have to know who they are and have enough confidence in that decision that you don’t make a mistake.
Then the other part of it is that you want to protect both the data and the code from anyone who isn’t supposed to see or activate it. It’s not enough to protect just the data or just the code. You need both.
In that sense, security isn’t that hard if it is your concern from day one. There are a bunch of entry points that people will use to get to the features and functionality. First, you identify them, then you check to see if they can access the given functionality. If they can, then lower down, you check to see if they can access all of the data that feeds into that functionality. If they can, then they can see the output. Simple 🙂
Here’s where the trouble starts. First, there should be no anonymous endpoints. People love adding them, but they open the door to leaks and denial-of-service attacks. If you have none, though, all of that goes away. If you can’t quickly identify someone right at the top, punt them immediately and send a log entry to some administrator. They might have to block the incoming address or put up some firewalls to stop botnets and other nasty things. You always need to flag a punted user as a serious problem.
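A minimal sketch of that punt-and-log rule in Python. The token table, the logger name, and the function name are all made up for illustration; a real system would check credentials against its own identity provider and route the alert to wherever the administrators actually look.

```python
import logging

# Assumption: a dedicated logger that some administrator actually watches.
security_log = logging.getLogger("security.alerts")

# Hypothetical identity store: token -> user.
KNOWN_TOKENS = {"token-abc": "alice"}


class Unidentified(Exception):
    """The caller could not be identified, so the request goes no further."""


def identify_or_punt(token, remote_addr):
    """Return the user behind the token, or reject the request right at the top."""
    user = KNOWN_TOKENS.get(token)
    if user is None:
        # Flag it loudly: an admin may need to block this address.
        security_log.error("punted unidentified caller from %s", remote_addr)
        raise Unidentified(remote_addr)
    return user
```

If every entry point calls something like `identify_or_punt` before doing any work, there is no anonymous path left to leak data or soak up resources.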
Second is databases. For capitalist reasons, vendors charge per user, so the system users are not the same as the database users. That sucks, and it has always sucked. Life would be pretty easy if a person’s identity propagated all the way down to the metal. It should.
If there is a need for a group or functional account shared by a bunch of people, then the group is the identity, and that identity goes all the way down.
If your database or its license makes that impossible, then you need to wrap it. You need to wrap it thoroughly enough that pretty much nobody can get to it in any way without first passing an identity check. So, not just in the backend code, but also in the scripts on the machine, at the OS level, etc. Everywhere.
Wrap the database. That won’t be convenient, since convenience is the opposite of security, but you need to do it.
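As a rough sketch of what the wrapping might look like inside the backend code, here is a Python example where an in-memory SQLite database stands in for the real one. The class, table, and column names are all hypothetical. It only covers the code path; scripts and OS-level access would need to be locked down separately.

```python
import sqlite3


class AccessDenied(Exception):
    """The identity has no grant for the requested data."""


class WrappedDatabase:
    """The only path to the data; every method demands an identity."""

    def __init__(self):
        # Assumption: in-memory SQLite stands in for the real database.
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
        )
        self._conn.execute(
            "CREATE TABLE grants (identity TEXT, account_id INTEGER)"
        )
        self._conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 250)]
        )
        # A person and a shared group account are both just identities.
        self._conn.executemany(
            "INSERT INTO grants VALUES (?, ?)",
            [("alice", 1), ("finance-team", 2)],
        )

    def read_balance(self, identity, account_id):
        # The data-level check lives here, below the application code.
        row = self._conn.execute(
            "SELECT a.balance FROM accounts a "
            "JOIN grants g ON g.account_id = a.id "
            "WHERE g.identity = ? AND a.id = ?",
            (identity, account_id),
        ).fetchone()
        if row is None:
            raise AccessDenied(f"{identity} cannot read account {account_id}")
        return row[0]
```

The connection is kept private, so callers can only reach the data through methods that take an identity. A missing account and a missing grant fail the same way, which avoids telling a caller which accounts exist.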
Now, at the top, after you have checked identity, you take a quick look at whatever functionality is called. Are they allowed to use it? In some extreme cases, that is a messy lookup table, and it needs to be managed by data admins. It’s annoying, but in a large organization, that really should be a distinct piece that is shared by a whole bunch of systems. You just check with it: user X wants to call foo, is that okay?
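The “user X wants to call foo” check could be sketched like this in Python. The `PermissionTable` class and the `guarded` decorator are invented for illustration; in a large organization the table would more likely be a shared service than an in-process set, but the question asked of it is the same.

```python
import functools


class PermissionTable:
    """Hypothetical lookup table of (user, function name) pairs."""

    def __init__(self):
        self._allowed = set()

    def grant(self, user, function_name):
        self._allowed.add((user, function_name))

    def revoke(self, user, function_name):
        self._allowed.discard((user, function_name))

    def is_allowed(self, user, function_name):
        # Anything not explicitly granted is denied.
        return (user, function_name) in self._allowed


PERMISSIONS = PermissionTable()


def guarded(func):
    """Ask the table before letting an already-identified user call func."""

    @functools.wraps(func)
    def wrapper(user, *args, **kwargs):
        if not PERMISSIONS.is_allowed(user, func.__name__):
            raise PermissionError(f"{user} may not call {func.__name__}")
        return func(user, *args, **kwargs)

    return wrapper


@guarded
def foo(user, value):
    return value * 2
```

Because the check sits in one reusable decorator, adding a new piece of functionality means adding one line above it and one row in the table, not another hand-written check.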
If that’s good, then as the code executes and hits the wrapped database, the second check will trigger on the data. If that’s good too, then it is all done. If you always reuse both the higher and lower levels, then the security will be everywhere, and you don’t have to lie awake at night worrying about it failing.
The only other part is that if a user ever sends you ‘code’, you laugh and reject it. If you want some cool dynamic execution feature, great, but there have to be two paths, not one. The code comes in from somewhere else, having been fully and completely vetted, and then the user later asks for it to execute dynamically. That keeps it really simple, and sets you up with some external means for this uber dangerous code to be properly managed, vetted, and approved. That in itself is a huge task; you can’t just ignore it and hope for the best. Dynamic code can never be ‘open’ dynamic code; it has to be closed and come from a reliable source that actually has to be more reliable than just reliable.
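One way to picture the two paths, as a small Python sketch. The registry class and its method names are assumptions made up for this example, and the real work, the vetting itself, happens outside of anything shown here.

```python
class VettedCodeRegistry:
    """Hypothetical registry keeping the two paths separate."""

    def __init__(self):
        self._approved = {}

    def approve(self, name, func, approved_by):
        # Path one: called only by the external vetting process, never by users.
        self._approved[name] = (func, approved_by)

    def run(self, name, *args):
        # Path two: users name the code they want run; they never supply it.
        if name not in self._approved:
            raise LookupError(f"no vetted code named {name!r}")
        func, _approved_by = self._approved[name]
        return func(*args)


def double(value):
    return value * 2


registry = VettedCodeRegistry()
registry.approve("double", double, approved_by="review-board")
```

A user request then carries only a name and arguments, for example `registry.run("double", 4)`. Anything that isn’t already in the registry is rejected, no matter what the user sends along with it.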
So, in the end, if you wrap the database, always identify everyone, manage a lookup table or two, and punt anything that could ever possibly be executed by any downstream party or library, then you’re done. All of this code is reusable; you just need to do it once, at the beginning of the project, then leverage it for success and glory.