Sunday, December 12, 2021

Upside Down

Why is git so difficult to use? Way back, after I had already switched from CVS to Subversion, I finally started using Mercurial. I found the transition to be quick and easy; never had a problem getting it to do the things that I needed it to do. it was good, I really liked it. But later, I switched to git.

For whatever reason, I can’t seem to “get” git. It’s annoying and awkward and I pretty much have to keep looking up how to do stuff, even for a lot of simple, routine operations. Is it me?

I don’t think so. I think the problem is the way it was built. Someone once described Mercurial as James Bond and git as McGyver. Even though I’ve forgotten who said it, that analogy always stuck deeply in my head.

It seems that git is focused on doing the “right” things with the underlying data structures. It’s a programmer’s vision of the way that the data is stored. It doesn’t care about the users, it doesn’t want to be a nice and polite tool. It just exposes a rather huge, somewhat convoluted CLI, to get to the underlying data, and then you are left on your own to figure out how to map that back to the things you really need to do. Basically, it lacks any and all empathy for the people who end up having to use it on a daily basis.

And that would be very different from any tool that took a top-down approach to building up the functionality for people to accomplish their goals. That sort of high-level entry point into figuring out what code is needed to solve a problem starts right with the people doing the work and then maps it back down to the work that needs to be done. So, it’s cleaner and more organized around those higher perspectives.

Now it’s true that if you are coding from a bottom-up perspective it would be considerably faster and you’ll get a lot more reuse, so way less code and way less testing. As a way to build things, the bottom-up approach is clearly superior. But the big trick is that you don’t go all of the way back up to the top while doing that. Git is a good example of why this ends up just being awkward, and although it is usable, it wastes an awful lot of people’s time on what are essentially make-work issues.

This is why, for development, we always want to consider what we write from a top-down perspective first. We need to start there, we need to anchor the work in what the users need. What they really need. But we should switch and build it from a bottom-up approach afterward. The two sides really have to meet somewhere in the middle, where they are the least expensive. So it’s a bit of a flip-flop to get it right.

If you are curious, that is why so many development process tools and methodologies are also a mess. They usually pick the top-down side and force it on everyone. That makes sense from the design side, but it’s completely wrong from the construction side. It ends up promoting brute force as the construction style, which ends up crafting extreme amounts of redundant fragile code, which is inherently unstable and needs copious amounts of testing. So we end up doing way too much work that is forced into low quality because it takes way too long.

But if you go to the other extreme, it is wrong too as git so elegantly showed, and we keep ping-ponging between these two poles, without figuring out that the place we actually need to be is right there in between. I guess it’s a tendency to try to reduce everything to binary terms, just 2 colors, so we can overly crude logic on top. Multi-dimensional gradients are considered too complex, even if they do match our world properly. Probably also related to why most code doesn’t handle errors properly. It’s not just “works” or “error”, different types of errors often require very different types of processing. It’s the Achilles heel of the coding industry. We’re always in a crazy rush these days, so we cling to gross over-simplifications at our base, but those usually end up wasting even more time. just making it worse.

Thursday, November 11, 2021

Non-destructive Refactoring

In order to evolve code, over increasingly sophisticated versions, you have to learn and apply a fairly simple skill for any language or tech stack.

It’s called non-destructive refactoring.

Although it has a fancy name, refactoring is really just moving around your lines of code and changing the way you partition them with abstract language concepts like functions, methods, and objects. There are lots of good books and references on refactoring, but few of them do a really good job at explaining why you should use the different techniques at different times.

The non-destructive adjective is a directive to constrain the wholistic effect of your changes. That is, you take an existing piece of working code, apply a bunch of refactoring to it, and when you are done the code will work 100% exactly as it did before. There will be no changes that users or even other programmers can or should notice.

All that has happened is that you’ve kept the same underlying instructions to the computer, but you have just moved them around to different places. The final effect of running those instructions is exactly the same as before.

A huge benefit of this is that you can minimize your testing. You don’t have to test new stuff, you just have to make sure what is there now is indistinguishable from what was there before.

If you are thinking that this is pretty useless, you are tragically mistaken.

The intent in writing code generally starts out really good. But the reality is that as the deadlines approach, programmers get forced to take more and more icky shortcuts. These shortcuts introduce all sorts of problems, that we like to refer to those as technical debt.

Maybe it’s massive over-the-top functions that combine too much work together. Maybe it’s the opposite, the logic is chopped up into tiny pieces and fragmented all across the source files. Maybe it is a complete lack of error handling. It doesn’t matter, and everyone was their own unique bad habits. But it does mean that as the code gets closer to the release, the quality of the code you are producing itself degenerates, often rapidly and to pretty low standards.

So, after a rather stressful release, possibly the worst thing you can do is just jump right back in again and keep going from where you left off. Because where you left off was hacking away badly at things in order to just toss the code out there. That is not a reasonable place to restart.

So, instead, you spend a little bit of time doing non-destructive refactoring.

You clean up the names of things, move code up and down in the function stack, add comments and decent error handling, etc. You basically fix all of the bad things you did right at the end and try to get yourself back to the code you wanted, not the code you ended up writing.

If you blurred the lines between layers, you correct that. If you skipped adding a layer, you put it in. If you added way too many layers, you flatten them down somewhat. You stop hardcoding things that would be nice to change and put them into cleaned-up configuration files. You document the things you missed.

If you have multiple copies of the same function all over the place, you pick one or merge the best parts, then you spend the time to point all of the other code back to that correct version. That’s one of the few places where this type of work isn’t truly non-destructive. One copy of the function maybe have been incorrect, you are rather accidentally fixing a bug or two by repointing it to something better. Because of that, you need to go up to everything that called that function and double-check that it was really calling it properly.

If you know that your next big work is building on top of your foundations, you rearrange things to make that work easier but restrict it to only changes that are non-destructive.

You take the time to do the right thing and to attend to all of the little details that you had to forget about earlier. You clean up your work first before you dive back in again.

If you keep doing this, even if you never seem to find enough time to do all of the work you know you should do, you will find that gradually, cleanup session after cleanup session, that you are moving in the right direction, and that the more often you do this, the less work you end up having to do for each release. That is, really messy code imposes a massive amount of friction which slows you down a lot. That friction is gone if the code is clean, so any work to get the code cleaner is also working to save you time and pain in the not too distant future.

Once you’ve mastered non-destructive refactoring, the other major skill is to extend what is already there (which is pretty clean cause you’ve made sure it is) instead of just bolting on more junk to the side. That is another super-strong habit that really helps to keep development projects ‘in control’ and thus make them a lot more fun to work on.

Sunday, November 7, 2021

It’s Slow, We Should Rewrite it

Whenever a programmer says “it’s slow, we should rewrite it” there is probably like an 80% chance that that programmer doesn’t know what they are talking about. As a refrain, it’s pretty close to “works on my machine” for professional malfeasance.

The first issue is that X amount of work cannot be done in any less than X amount of time. In some super cool rare situations, one can achieve a logarithmic reduction of the work, but those types of optimizations do not come easily. It might be the case that the code is de-optimized, but it would be foolish to assume that without actually doing some performance testing.

The driving issue though is usually that the person making that suggestion does not want to actually go and spend time reading the older code. They like to write stuff, but they struggle to read it. So, as their offensive argument, they picked some attribute that is visible outside of the code itself, proclaim that that is the real problem, and use that as their justification to just throwing away someone else's work and starting over from scratch, without bothering to validate any of their assumptions.

Sometimes they have arrived at that argument based on a misunderstanding of the underlying technologies. They often assume that newer technologies are inherently better than older ones. Ironically, that is rarely the case in practice, the software crisis dictates that later programmers understand less of what they are actually doing, so it’s less likely that their repeated efforts will be better in general. It is true that there is some newer knowledge gained, which might feed into improved macro or micro optimizations, but those benefits can be lost to far more de-optimizations, so you can see why that is a really bad assumption. On top of all of that, the software industry has been rather stagnant for actual innovations for a long time now, most supposedly new technologies are just variations on older already existing themes. It just cycles around endlessly these days. Whatever is old is new again.

With all of that added up, you can’t say that an implementation in tech stack A would be faster or better than one in B. It’s not that simple. That’s been true for decades now. There are some pretty famous cases of people going right back to older technologies and using them to get far better results. The tech stack matters for other reasons, but usually not for performance or quality.

About the only thing you can say about one implementation is that it is a whole lot messier and disorganized than another. That the quality of work is poor. It’s just a pile of stuff hastily thrown together. But you cannot know that unless you actually dive in and read, then understand the code itself. You can’t look at 5% of it and draw that conclusion. And any outside behaviors are not enough to make those types of assertions.

Overall it is rarely ever a good idea to rewrite software anymore. There are times and situations when that changes, but it hasn’t been the default for a long, long time. The best alternative is to start refactoring the code so that you keep all of the knowledge that has already built up in it, and learn to leverage that into something that exceeds the scope and quality of the original code. You can’t do that by refusing to read it, or by ignoring the knowledge that went into it. If the code was in active development for a decade, then rewriting it would literally set you back 10 multiplied by the number of programmers involved over that period. Which is a huge backslide, and highly unlikely to be successful in any time frame. It takes an incredibly long time to slowly build up code, so even if it isn’t perfect, it represents a lot of time and knowledge. You need to mine and leverage that work, not just toss it blindly into a dustbin. It’s quite possible that the coders who wrote that original work were a lot better at their jobs than you are at yours.

Sunday, August 15, 2021

Self-describing Names

If you are trying to keep track of an X in a system and you name any variable that holds information about it as X then your intent it’s really obvious.


If you call the variable that holds the data something like ‘foorbarizilla’ then you’ve intentionally misnamed that data in the system. If it’s foobarInfo or foobarData, that isn’t any better. Nor is xFactory, prj1_X or ptrX. 


But it’s not just variables. If the function should be called Y but instead it’s called GetN, you messed up. If the process tracks Zs but it says it is monitoring Qs, then it is also wrong. 


These are all just variations on the same problem. You aren’t correctly telling other people what you’ve put in your variable or what you are doing with the code or data. 


You may have given the variable a name, but it doesn’t match the contents. But you really want to tell other people what data you’ve used the variable to store. You want to describe the contents as precisely as you can, and then embed that description right into the name. 


You know it is a ‘self-describing’ variable name because it doesn’t need a comment, description, or an explanation; that would be redundant. The name says it all. If the variable is a composite, people can look up its underlying variables and get even more enlightened. It's all clear and obvious. No one is ever confused.


Naming is hard. It’s a difficult skill to learn and it’s way harder if you are coding in a second language. That said, it is also one of the most important skills you need to acquire to be able to consider yourself a professional programmer. Anyone can write code, but very few can do it properly.


It’s not enough to just get the code working. There are always bugs and any useful code will evolve. If the work you are doing isn’t just throw-away or wasted time, then it is done very poorly if you’ve explicitly made it harder for it to be supported or enhanced. 


Part of professional programming is building things that last for a long time. That means ensuring that the code lasts by communicating how it works to others. But it’s not just other programmers that you need to worry about. If you’ve picked way too many random crappy names, later, when you end up revisiting the code in a few months you’ll have forgotten what they mean too. 


It's worth noting that your names for technical things should be the correct technical names as used in our industry, and your names for domain issues should match the vocabulary of your users. Mostly variables should be nouns. Functions, end-points, and any other actions should be verbs. Most of the time you don’t need to make up names, but there are often many situations where a bunch of different names are equally applicable. In general, picking the shortest name is usually best. 


Code should not be overly dense; names are one of the key modifiers of density. Well-balanced code makes the readability pop. The meaning is not obscured by useless noise, it comes out clearly.


You can easily test the quality of a name. If it needs a description for people to understand it, then it is a poor name. If it exactly describes itself, then it is not.


There are many bad things to use as names or to embed into them:


  • Jargon (if there is a better, simpler choice)

  • Made-up acronyms

  • Redundant terms, stuttering or stuttering within a larger context

  • Meaningless junk/filler words


Naming takes time. If you are in a rush, the names will reflect that. If you are misfocused on some other aspect of the code, the names will also reflect that. 


If you make bad low-level changes that create or don’t fix naming issues, any code on top is also wrong. You can’t just throw extra data into any old column in a database, for example, you need to make sure the schema is accurate. A few bad names won’t wreck an entire project, but a lot of them will inject enough instability into everything else that it will cause unnecessary grief. 


Your first attempt at naming stuff is probably wrong. That’s okay, and that’s why it is so important to revisit the code, files, configs, etc. over and over again throughout the life of the project. This lets you head off this easily avoidable technical debt before it spirals out of control. Code isn’t just written and then forgotten. It evolves and grows with the project, it needs constant tending. 


Having a bunch of redundant little lumps of code does not avoid this problem, in fact, it usually makes it far worse as it piles up and wastes a lot of time and energy. 


If you have to tackle a sophisticated problem, you’re going to need some really well-thought-out, sophisticated code to accomplish that. There are no easy ways around it; trying to find them just makes it all worse. But it all starts with a bunch of names. If you don’t even know what to call it, you’re trying to code way too early.

Sunday, July 18, 2021

Idempotent

When I was very young, a way more experienced programmer told me that for incoming feeds, basically any ‘imports’, they must always be ‘idempotent’. 

In the sense that he meant it, if you have some code to import a collection of data, say in a csv file, you can feel safe and confident rerunning that import as many times as you need to. At worst, it just wastes some CPU.


That feed will do exactly the same thing each time. Well, almost. The result of running it will mean that the data in the file is now in the database, but it is there uniquely. 


If the feed was run before, that earlier data will be updated or it will be ‘deleted and reinserted’. It will not be duplicated.


The corollary is that if the data changed in the file from an earlier version, it is that later data that will persist now. This provides a secondary feature of being able to use the feed to ensure that the data is always up-to-date and correct without first having to verify that it is not. If in doubt, rerun it.


If you have a big system, with a lot of data loaded over years, this means that if you kept all of the data input files, you could replay all of them to thus ensure that the outcome would be clean data. 


But It also means that if someone had edited the data manually, directly in the database, then replaying the feed would overwrite that change. A behavior that is sometimes good, sometimes bad.


To get around that ambiguity, it’s best to understand that any data coming in from an external source should be viewed as ‘read-only’ data by definition. It’s somebody else’s data, you just have a ‘copy’ of it. If there were ‘edits’ to be made to that data, they should be made externally to the system (the source does it or you might edit the feed files, but not the database itself). 


If it is necessary to annotate this incoming data, those modifications would be held separately in the schema, so that replaying the original data would not wipe them out. That doesn’t mean that it isn’t now inconsistent (the new data nullifies the annotations), but it does ensure that all of the work is preserved.


An important part of getting feeds to be idempotent is being able to understand uniqueness and keys. If the data is poorly keyed, it will inevitably become duplicated. So, making sure that the internal and external keys are handled properly is vital to assuring idempotency.


Oddly, the mathematical definition of idempotent is only that repeated ‘computations’ must result in the same output. Some people seem to be interpreting that as having any subsequent run of the feed be nil. It does nothing and now ignores the data. That’s a kind of a landmine, in that it means that fixing data is now a 2 step process. You have to first delete stuff, then rerun the feed. That tends to choke in that deleting things can be a little trickier than performing updates, they can be hard to find, so it’s often been far more effective to set the code to utilize ‘upserts’ (an insert or a delete depending on if the data is already there) as a means of replying the same computation. Technically, that means the first run is actually slightly different from the following ones. Initially, it inserts the data, afterward, it keeps updating it. So, the code is doing something different on the second attempt, but the same for all of the following attempts...


What I think that veteran programmer meant was not that the ‘code paths’ were always identical, rather it was that the final state of the system was always identical if and only if the state of the inputs were identical. That is, the system converges on the same results with respect to the data, even if the code itself wobbles a little bit.


Since we had that talk, decades ago, that has always been my definition of ‘idempotent’. The hard part that I think disrupts people is that by building things this way you are explicitly giving control of the data to an external system. It changes when they decide it changes. That seems to bother many programmers, it’s their data now, they don’t want someone else to overwrite it. But,  if you accept that what you have is only a read-only copy of someone else’s data, then those objections go away. 

Sunday, July 11, 2021

Time Consuming

For most development projects, most of the time to build them ends up in just two places:


  • Arranging and wiring up widgets for the screens

  • Pulling a reliable window of persistent data into the runtime


The third most expensive place is usually ETL code to import data. Constantly changing “static” reports often falls into the fourth spot.


Systems with deep data (billions of rows, all similar in type) have easier ETL coding issues than systems with broad data (hundreds or even thousands of entities). 


Reporting can often be off-loaded onto some other ‘reporting system’, generally handled outside of the current development process. That works way better when the persistent schema isn’t a mess.


So, if we are looking at a greenfield project, whose ‘brute force’ equivalent is hundreds or thousands of screens sitting on a broad schema, then the ‘final’ work might weigh in as high as 1M lines of code. If you had reasonably good coders averaging 20K per year, that’s like 50 person-years to get to maturity, which is a substantial amount of time and effort. And that doesn’t include all of the dead ends that people will go down, as they meander around building the final stuff.


When you view it that way, it’s pretty obvious that it would be a whole lot smarter to not just default to brute force. Rather, instead of allowing people to spend weeks or months on each screen, you work really hard to bring it down to days or even hours of effort. The same is true in the backend. It’s a big schema, which means tonnes of redundant boilerplate code just to make getting around to specific subsets convenient.  


What if neither of these massive work efforts is actually necessary? You could build something that lays out each screen in a readable text file. You could have a format for describing data in minimalistic terms, that generates both the code and the necessary database configurations. If you could bring that 1M behemoth down to 150K, you could shave down the effort to just 7.5 person-years, a better than 6x reduction. If you offload the ETL and Reporting requirements to other tools, you could possibly reach maturity in 1/6th of the time that anyone else will get there. 


Oddly, the above is not a pipe dream. It’s been done by a lot of products over the decades; it is a more common approach to development than our industry promotes. Sure, it’s harder, but it’s not that much riskier given that starting any new development work is always high risk anyways. And the payoff is massive.


So, why are you manually coding the screens? Why are you manually coding the persistence? Why not spend the time learning about all of the ways people have found over the last 50 years to work smarter in these areas? 


There are lots of notions out there that libraries and frameworks will help with these issues. The problem is that at some point the rubber has to hit the road, that is, all of the degrees of freedom have to be filled in with something specific. When building code for general purposes, the more degrees of freedom you plug up, the more everyone will describe it as ‘opinionated’. So, there is a real incentive to keep as many degrees open as possible, but the final workload is proportional to them. 


The other big problem is that once the code has decided on an architecture, approach, or philosophy, that can’t be easily changed if lots of other people are using the code. It is a huge disruption for them. But it’s an extraordinarily difficult task to pick the correct approach out of a hat, without fully knowing where the whole thing will end up. Nearly impossible really. If you built the code for yourself, and you realized that you were wrong, you could just bite the bullet and fix it everywhere. If it’s a public library, they have more pressure to not fix it than to actually correct any problems. So, the flaws of the initial construction tend to propagate throughout the work, getting worse and worse as time goes by. And there will be flaws unless the authors keep rewriting the same things over and over again. What that implies is that the code you write, if you are willing to learn from it, will improve. The other code you depend upon still grows, but its core has a tendency to get permanently locked up. If it was nearly perfect, then it’s not really a big problem, but if it were rushed into existence by programmers without enough experience, then it has a relatively short life span. Too short to be usable. 


Bad or misused libraries and frameworks can account for an awful lot of gyrations in your code, which can add up quickly and get into the top 5 areas of code bloat. If it doesn’t fit tightly, it will end up costing a lot. If it doesn’t eliminate enough degrees of freedom, then it’s just extra work on top of what you already had to do. Grabbing somebody else’s code seems like it might be faster, but in a lot of cases it ends up eating way more time.  

Sunday, July 4, 2021

Priorities

For me, when I am coding for a big system, my priorities are pretty simple:

  1. Readability

  2. Reusability

  3. Resources


It’s easy to see why this helps to produce better quality for any system.


Readability 


At some point, since the ground is always shifting for all code, either it dies an early death or it will need to be changed. If it’s an unreadable ball of syntactic noise, it’s fate is clear. If the naming convention is odd or blurry, its fate is also clear. If it’s a tangle of unrelated stuff, it is also clear. A big part of the quality of any code is its ability to survive and continue to be useful to lots of people. Coding as a performance art is not the same as building usable stuff. 


Deliberately making it hard to read is ensuring its premature demise. Accidentally making it hard to read isn’t any better.


Cutesy little syntactic tricks may seem great at the time but it is far better to hold to the ‘3am rule’. That is, if they wake you up in the middle of the night for something critical can you actually make sense of the code? You’re half asleep, your brain is fried, and you haven’t looked at this stuff for over a year. They give you a loose description of the behavior, so your code is only good if a) you can get really close to the problem quickly, and b) the reason for the bug is quite obvious in the code. The first issue is architecture, the second is readability. 


Another way of viewing readability is the number of people who won’t struggle to understand what your code is doing. So, if it's just you, then it is 1. It is a lot higher if it’s any other senior programmer out there, and even higher if it’s anyone with a basic coding background. Keep in mind that to really be readable someone needs to understand both the technical mechanics and the domain objectives.


Reusability


If the code is readable, then why retype in the same code, over and over again for 10x or even 100x times? That would be insane. A large part of programming is figuring out how NOT to retype the same code in again. To really build things as fast as possible you have to encounter a problem, find a reasonable working solution, get that into the codebase and move on. If you can’t leverage those previous efforts, you can’t move on up to solving bigger and more interesting problems. You're stuck resolving the same trivial issues forever.


As you build up more and more reusable components, they will get more and more sophisticated. They will solve larger chunks of the problem space. The project then gets more interesting as it ages, not more unstable. It’s an active part of making your own work better as well as making the system do more. Finding and using new libraries or frameworks is the fool's gold equivalent since they almost never fit cleanly into the existing work, it just makes the mess worse. What you need is a tight set of components that all fit together nicely, and that are specific to your systems’ problem domain. 


Detachment: Sometimes in order to get better reusability you have to detach somewhat from the specific problem and solve a whole set of them together. Adding a strong abstraction helps extend the reusability, but it can really hurt the readability (which is why people are afraid of it). Thus to maintain the readability, you have to make sure the abstraction is simple and obvious, and provide relevant comments on how it maps back to the domain. Someone should be able to cleanly understand the abstraction and should be able to infer that it is doing what is expected, even if they don’t quite know the mapping. If the abstraction is correct and consistent, then any fix to its mechanics will change the mapping to be more in line with the real issues.


Resources 


The ancient adage “Premature optimization is the root of all evil” is correct. Yes, it is way better that the code minimizes all of its resource usage including space, time, network, and disk storage. But if it's an unreadable mess or it’s all grossly redundant, who cares? It’s only after you master the other two priorities that you should focus on resources.


It’s also worth pointing out that fixing self-inflicted deoptimizations caused by bad coding habits is not the same thing as optimizing code. In one case bad practices callously waste resources, in the other, you have to work hard to find smarter ways to achieve the same results. 


Summary


If you focus on getting the code readable, it’s not difficult to find ways to encapsulate it into reusable components. Then, after all of that, you can spend some time minimizing resource usage. In that order, each priority flows nicely into the next one, and they don’t conflict with each other. Out of order they interact badly and contribute to the code being a huge mess.

Thursday, June 24, 2021

Managing

The point of management is to take a group of people and get something done.

The difficulty of management is that you can’t just do it yourself. You might know how to do it, you might only have a vague idea, or you might even be entirely clueless. But even if you do know, which is obviously better, you still don’t have the time to go it alone.


The trick to management is to 'enable' other people to do the work for you. The catch is that you don’t want them going off rogue. The tradeoff is that the more you try to control them, the worse they will behave. So, there is this fine touch needed to steer them in the right direction, but then let them go off and do the work they need to. 


You need to treat them like they are the fingers on your own hand. You know what you want to pick up, they each position themselves to make that possible. 


To enable someone you have to put them in the right place, at the right time, with the right tools. Everyone’s different, everyone is unique. You need to draw a wide box for them to operate, but you also need to know when they’ve strayed out of bounds. You can’t crush their confidence, but you don’t want them overloaded with arrogance or hubris either. They’re all different types of chess pieces that you need to fit into the right places to be successful.


You can’t ever push them under a bus, even if they aren’t working out. Everyone else will see that and be affected by it. You need patience, guidance, and it requires a lot of effort.


Some people see management as the road to advancement. Others see it as an inevitable bump away from their past. Either way, a good manager enables the people under them, a bad one is an obstacle to be avoided as much as possible. So, in a real sense, if your people are avoiding you they are sending you a strong message.


Sometimes people think that management is just running around spouting out a lot of orders. That’s the opposite of effective management. It’s just an annoying person causing trouble. The real skill is listening. You have to hear what blocks people, what tires them out, what diminishes their confidence. You have to understand the larger picture, the context playing out from above. You have to know what their boxes are, but you don’t always have to fully understand the specifics of each item in them. When you don’t understand you have no choice but to listen to your people and trust them. If you do understand, and they are rogue, you have to gently guide them back on course. Either way, it isn’t for you to layout each and every detail of their assignments, only those cross-cutting items that absolutely need to be in place in order to ensure that the bigger goals are also met. You coordinate and protect, you don’t interfere and punish. 


If you understand this then people will want to work for you. If you don’t get it then you are on your own, no matter how many people are below you. The choice is yours.

Sunday, June 13, 2021

Friction

In software development, ‘friction’ is any and everything that keeps you from moving swiftly to getting your work ‘fully completed’. It’s worth understanding that the job is not done until it’s out there, being used smoothly by a bunch of people. Coding it is a start. Testing it is better. Releasing it is great, but it’s still not done until it’s been running correctly for long enough that it won’t end up showing up back in the shop needing repairs.

Along the way, there are literally millions of little things that get in your way and slow you down. Things that are unspecified, or awkward, or require you to remember a lot of trivial stuff. The problems can be neglect, or ambiguity, or lack of exposure. It doesn’t matter what it is, or whether other people think that it is normal and acceptable, only that it has resulted in you slowing down a bit.

In some cases, particularly with methodologies and ‘controls’, the friction is actually deliberate. Someone, somewhere, wants you to slow down. Sometimes they have extremely good reasons for this. Sometimes it is just a random accident of the dysfunction in your organization. Most often, there is nothing you can do about the larger systemic friction. You just have to live with it.

But there are lots of smaller, closer to home, areas where you can and should eliminate friction. You should always leave a little bit of extra time, so you can remove one or two small issues a week. If you keep that up over the long haul, things will gradually get a lot better.

If it’s awkward to find the stuff you are looking for, you can invest some time in organizing it, or acquiring better-searching tools, and learning how to use them properly.

If it’s repetitive actions, you can automate them with scripts, or cute little GUI widgets to trigger larger actions.

If it’s volatility coming directly from the domain, you can spend time to learn more about the domain and talk with as many people as possible. In all domains, some issues are intrinsically dynamic. It’s a huge mistake to try and treat these statically.

These things are all in your control. You can spend time improving on them. Even if they are little, they add up quickly.

If the issue is not enough time, you need to learn to add a bit of slack in your estimates. Oddly, if you get ahead of the ball and reduce a lot of friction, you’ll find that you’ve caught up and often have more time than you need. The mistake is getting behind in the game, and then digging the hole deeper and deeper by just trying to ignore all of the little problems around you.

Stop digging, apologize for being late and fix a small issue or two.

Sunday, June 6, 2021

Data Flow

I like to write this post every few years. People generally skim over it, and then ignore what I am saying. So I’ll probably end up writing it a few dozen more times, then give up completely.

There is one major trick to keeping everything sane while building a big system:


Forget about the code. 


Just ignore it. 


The problem isn’t the code, it isn’t ever the code.


It’s the data. 


If the data flows around the organization in a clean organized matter, then the code is just the secondary issue about the small translations needed as it moves about.


That is, if the data is good, then any problems with the code are both obvious and easily fixable.


If the data is bad or incomplete, the entire stability and trustworthiness of the system is busted. Collecting megabytes of useless data is an epic waste of time. There is no foundation for the work.


Also, stepping back and viewing the entire charade as just data flowing around from place to place is extraordinarily simpler than trying to grok millions of lines of code. Mostly, we collect some data from the outside and combine it with data collected from people in the middle of their problems. That’s it. 


You should never be writing code to retroactively repair data. That is if the data doesn’t exist, or it is ambiguous, or it’s stored in a broken format, that is the real problem. Patching that with flaky code is not a real solution. Fixing the data is. 


If you understand how the data needs to be structured to store it properly, and you honor that knowledge in the code as it moves around, then everything else is a thousand times easier. It’s code you need to write to move it here, or there. It’s code you need to write to translate it into another format. It’s code you need to write to combine some of it together to craft derived data. That’s pretty much it.


Then you can spend your creative energies on building up enough sophistication to really help people solve their real problems.

Wednesday, June 2, 2021

What Should Have Happened ...

 The first thing to do is admit that it went wrong.

The next thing is to figure out what should have happened instead.


With the two outcomes, there will be a bunch of points in time where things deviated from the desired path. Find all of them, list all of them out.


In some cases, an earlier deviation invalidates some of the later ones. That is, if it had not initially deviated, then some later deviation would never have occurred and is rendered moot.


That leaves you with a list of ‘critical’ deviations. They are independent, or at least mostly independent enough that they all matter. 


Keep in mind that a bunch of concurrent small problems can interact together to create much larger ones. If one or more of the deviations was a ‘perfect storm’, then you need to list out all of the other contributing littler deviations.


With a final list of significant deviations, you can roughly assign a weight and fixing costs to them. 


An earlier deviation might actually have very little weight, it might just be more of an annoyance. But it still should be documented as ‘contributing’. 


There might be some deviations whose real cost is far too high to realistically fix them. The only solution for those is to put in place controls or mitigations to reduce their effect next time. But they should be noted as expected. 


So now you have:

  • What went wrong.

  • What should have happened instead.

  • A full list of everything significant, including breakdowns of lesser contributions.

  • A list of alternatives and a list of mitigations. Including the approx weights and costs of each.


The only thing left to do is march through the list and fix things.


Wednesday, May 19, 2021

Irrational Mappings

Right there, in the center of all software development projects, is a set of rather odd mappings from our irrational physical world onto a very strict formal computational one. 

These difficult mappings form the core of any system. 


Get them right, and it is easy to package up all of the other aspects of the system around them. Get them wrong and their faults permeate everything else, making it all rather hopeless.


The act of designing and building these mappings is not intrinsically difficult by itself, but there are enough places for variations and trade-offs that it is way too easy to get it completely wrong. 


It's not something that you learn overnight, nor is it easily explainable to others. There are no blog posts to read, nor textbooks to lay it out in any detail. 


The best way to learn it is to work with someone else who understands it, then gradually build up enough experience that you can eventually strike out on your own. Getting it wrong a dozen times doesn't help as there are millions of possible mistakes you can make. It’s only when you see it being done well, that you are able to get some type of sense of how it all comes together. 


You can tell when development projects are led by someone who gets it. The big and small parts all fit together nicely. It’s also pretty obvious when the project is adrift without a strong lead. 


Since this is the blurry intersection between technology and the problem domain, you really have to have spent time on both sides of the fence in order to see how everything lines up. A strong business background tells you about the problem. A strong technical background tells you about the possibilities of the solution. But neither side on their own helps with working through this mapping. You have to connect these two sides together, carefully, but since it's odd, crazy, and irrational, then throughout the process you have to try to balance out any hard edges in order to find some type of reasonable working compromise. 


The strength of any system is the quality of these mappings, which comes directly from the skill level of the people putting them together. It might just be one person, which tends to be more coherent, but if the timescales are too short, a group of people may have to come together to sort it out. 


We have no known defined way of ‘documenting’ this mapping. It underpins all of the other design work, peeking through in artifacts here and there. While you can’t describe it, it is clearly visible when it is missing. 


Since it binds all of the other work together, it is always a huge disruption when the original author leaves and is replaced. The next person might know how to follow suit, maybe slowly changing it, but this only happens if they can ‘see’ the original effort. 


Two or more competing mappings will always clash destructively. They will rub each other in the wrong direction, forming one larger broken mapping. Bad choices will gradually undo any good ones. 


A development project is over when the mapping is complete for the whole problem domain or when the mapping is so hopelessly broken that any effort to fix it exceeds the effort to start over again. In the first case, the system just needs continual ongoing work to move the code forward with the constant changes in dependencies. In the second case, any work going into the system is wasted. The skill level necessary to correct a bad mapping is off the charts. 

Saturday, May 15, 2021

System Inventory

 A list of things that need to exist for any large software development project:

  1. A general write-up of the industry, domain, etc. including terminology and conventions.

  2. A roadmap for the growth and direction of the system.

  3. A set of requirements (by feature) from the user’s perspective.

  4. The user’s workflows as they relate to the domain problems they are encountering.

  5. A model for each of the different data entities that are encountered during the workflows.

  6. A list of real examples and all special cases.

  7. A set of requirements from the operational perspective.

  8. A design grid for all screens.

  9. A color scheme for the GUI.

  10. A high-level navigation structure for the GUI.

  11. An arrangement of data/functionality/layout for any ‘difficult’ screens.

  12. A high-level design/architecture for the system, broken down by major components including frameworks, libraries, and persistence.

  13. A high-level runtime arrangement for processes, machines, etc.

  14. A lot of mid-level designs for the organization of the code.

  15. A code repository with full history, including highlighting the code that was released to the different environments. Also contains all dependencies necessary for building.

  16. A development setup guide, including any tools, configurations, etc. 

  17. A style and conventions guide for making changes or new additions to the code base.

  18. Technical notes on solving complex problems, preferred idioms, references, and other issues.

  19. A searchable inventory of the reusable parts of the code.

  20. Low-level specifications and/or the code.

  21. A set of test suites that match back to the workflows and examples.

  22. A full list of all known bugs (testing, user, or operational), solved or currently pending.

  23. A set of scripts or instructions on how to build and package the code.

  24. A list of expected configuration parameters and flags for testing and production setups.

  25. A set of scripts to safely upgrade the system, including persistent schema changes and rollbacks.

  26. An operational manual on how to handle recurring problems.

  27. A tutorial on how to use the GUI for common tasks.

  28. A reference document for any APIs, plus any technical notes for calling it correctly.


Any missing items will cause grief.

Saturday, May 8, 2021

Redundancies

 If a computer program is useful, at some point it will get complicated.

That complication comes from its intrinsic functionality, but more often it is amplified by redundancies. 


If you want to delay getting lost in complexities for as long as possible, then you want to learn how to eliminate as many redundancies as you can. They occur in the code, but they can also occur in data as well.


The big trick in reducing redundancies is to accept that ‘abstractions' are a good thing. That’s a hard pill for most programmers to swallow, which is ironic given that all code, and the tech stacks built upon code, are rife with abstractions. They are everywhere. 


The first thing to do is to understand what is actually redundant. 


Rather obviously if there are two functions in the code, with different names, but the code is identical they are redundant. You can drop one and point all of its usages to the other one. 


If there are two functions whose inputs are “similar” but the outputs are always the same, that is redundant as well. If most of the inputs are the same, and the outputs are always the same, then at least one of the inputs is not used by the code, which can be removed. If one of the functions is a merge of two other functions, you can separate it, then drop the redundant part.


The more common problem is that the sum of the information contained in the inputs is the same, yet the inputs themselves are different. So, there are at least 2 different ways in the code to decompose some information into a set of variables. It’s that decomposition that is redundant and needs to be corrected first. The base rule here is to always decompose any information to the smallest possible variables, as early as possible. That is offset by utilizing whatever underlying language capabilities exist to bind the variables together as they move throughout the code, for example wrapping a bunch of related variables in an object or a structure. If you fix the redundant variable decompositions, then it’s obvious that the affected functions are redundant and some can be removed.


As you are looking at the control flow in the code you often see the same patterns repeating, like the code goes through a couple of ‘for’ loops, hits an ‘if’, then another ‘for’ loop. If all of the variables used by this control flow pattern are the exact same data-type (with the same constraints) then the control flow itself is redundant. Even if the different flows pass through very different combinations of functions. 


In a lot of programs we often see heavy use of this type of flow redundancies in both the persistent and GUI code. Lots of screens look different to the users, but really just display the same type or structure of data. Lots of synchronization between the running code and the database is nearly identical. There are always ‘some’ differences, but if these can be captured and moved up and out of the mechanics, then lots of code disappears and lots of future development time is saved.


We can see this in the data that is moving around as well. We can shift the name of a column in the database from being a compile-time variable over to being a runtime string. We can shift the type to being generic, then do late binding on the conversion if and only if it is needed somewhere else. With this, all of the tables in the database are just lists of maps in the running code. Yes, it is more expensive in terms of resources, but often that difference is negligible. This shift away from statically coding each and every domain-based variable to handling them all dynamically usually drops a massive amount of code. If it’s done consistently, then it also makes the remaining code super-flexible to changes, again paying huge dividends.


But it’s not just data as it flows through the code. It’s also the representation of the ‘information’ collected as it is modeled and saved persistently. When this is normalized, the redundancies are removed, so there is a lot less data stored on disk. Less data is also moved through the network and code as well. Any derived values can be calculated, later, as needed. Minimizing the footprint of the persisted data is a huge time saver. It also prevents a lot of redundant decomposition bugs as well. In some cases, it is also a significant performance improvement.


This also applies to configuration data. It’s really just the same as any domain data, but sometimes it’s stored in a file instead of in a database. It needs to be normalized too, and it should be decomposed as early and deeply as possible. 


Redundancies also show up in the weirdest of places. 


The code might use two different libraries or frameworks that are similar but not actually the same. That’s code duplication, even if the authors are not part of the project. Getting rid of one is a good choice. Any code or data ‘dependencies’ save time, but are also always just problems waiting to happen. They only make sense if they save enough effort to justify their own expense, throughout the entire life of the project.


Redundancies can occur in documentation as well. You might have the same similar stuff all over the place, in a lot of similar documents, which generally ages really badly, setting the stage for hardships and misunderstandings. 


Processes can be highly redundant as well. You might have to do 12 steps to set up and execute a release. But you do this redundantly, over and over again. That could be scripted into 1 step, thus ensuring that the same process happens reliably, every time. With one script it’s hard to get it wrong, and it’s somewhat documenting the steps that need to be taken.


The only redundancies that are actually helpful are when they are applied to time. So, for example, you might have saved all of the different binaries, for all of the different releases. It’s a type of insurance, just in case something bad happens. You have a ‘record’ of what’s been released with the dates. That can be pruned to save space, but a longer history is always better. This applies to the history in the repo and the documentation history as well. When something bad happens, you can figure out where it started to go wrong, and then do some work to prevent that from reoccurring next time, which usually saves a lot of support effort and helps convince the users that the code really is trustworthy.


In programming, there is never enough time. It’s the one consistent scarce resource that hasn’t changed over the decades. You can take a lot of foolish short-cuts to compensate, but it’s a whole lot better if you just stop doing redundant work all of the time. Once you’ve figured out how to do something, you should also figure out how to leverage that work so you never have to figure it out again.