Wednesday, July 29, 2020

Puzzles

I’ve always enjoyed doing large jigsaw puzzles. You dump the box of pieces out on a big table, turn them all over, find the edge pieces, and then disappear into the problem. For me, it is relaxing and somehow it recharges my batteries.


Programming has similar qualities but is a little different. It is grabbing a lot of pieces and assembling them into a big picture, but instead of there being 1000 pieces that we know belong to the final picture, there is an almost endless number of them. In a puzzle, each piece has a pre-defined place; in programming, each place has a large number of different options.


In a puzzle, we know what the final picture should be. It’s basically decided in advance, long before it is printed on the cardboard, long before it is cut up. In programming, the big picture is there too, but it has a large degree of flexibility; it’s vague, and often people have wildly different images of it.


In a puzzle, there are many different ways you can tell that you have connected the wrong piece. The knob might not fit tightly, the edges are different sizes, or the images don’t align. Since the end goal is to assemble the whole puzzle, a misfitting piece is wrong twice. It displaces the piece that should be there, but it also is not placed in its own correct location. If the image is close but subtly wrong, it can cause quite a delay, as it can also interfere with the placement of the pieces around it.


In programming though, since most people aren’t aware of the big picture, they just put the pieces somewhere that kinda fits. Then they force it in. So, if you could imagine a puzzle with no edges and mostly fungible pieces, people assembling it would be quite fast, but the outcome would be very unpredictable. That’s closer to what happens in coding. Without a big picture, a lot of programmers just pound in the pieces and then stick to the assertion that it was right. When it’s not, what emerges is a scrambled big picture that is incredibly difficult to add more pieces to.


At its best, I have enjoyed putting together software systems. It has a similar vibe to puzzles. You just focus on finding the right pieces, keep it somewhat organized, and eventually the picture, or the system, emerges from the effort. At its worst though, people are furiously pounding in pieces, arguing about their validity, and the whole thing is just a growing mess.


That kinda gives us a way of thinking about how to fix it. The first point is obviously that the people working on the pieces need to understand the big picture. You can’t just build a working system by randomly tossing together logic. It’s not going to happen. You need the big picture, in the same way you need it for a puzzle. It keeps you on track, and it prevents mistakes.


The second point is that forcing a non-fitting piece to connect to other pieces is a bad idea. In most puzzles, you get a sense that the piece you are adding is the ‘right’ piece. It fits, the images line up, it is where it is supposed to be. The same is actually true in programming. There is a very real sense of ‘fit’. The code nicely snaps into place, it is readable, and it feels like it is in a place where it belongs. Some people are more sensitive to this than others, but most programmers do get a real sense of misfitting; some are just taught to ignore it for the sake of speed.


Still, what makes a puzzle fun, to me at least, is not that I can do it super fast, but rather that I focus in on it and enjoy the process. The people around me may want it faster, but I have learned to take it at a smooth and reasonable speed.

Monday, July 27, 2020

Defensive Coding: Names and Order

A fundamental rule of programming is that the code should never lie. 


One consequence of that is that a variable name should correctly describe the data held inside. So, a variable called ‘user’ that holds ‘reporting data’ is lying. Report data is not user data; the name is incorrect.


A second consequence is that if most of the variables in the system holding data about people interacting with it are called ‘user’, then pretty much every variable in the system that holds the same data should have the same name. Well, almost. If one part of the system only deals with the subset of users who have admin privileges, then it’s okay for the variable name to be ‘admin_user’. Semantically, they are admin users, which is a subset of all users. The variable name can be scoped a little more tightly if that increases its readability or intent. Obviously though, if in another part the user data is passed into a variable called ‘report’, then that code is busted.
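As a minimal sketch, with entirely hypothetical names and data, consistent naming might look like this:

```python
# Hypothetical sketch: the same user data keeps the same name everywhere.
def load_user(user_id):
    # Stubbed persistence lookup; a real system would hit a database.
    return {"id": user_id, "name": "example", "admin": user_id == 1}

def render_dashboard(user):
    # 'user' holds user data, exactly as it does everywhere else.
    return f"Dashboard for {user['name']}"

def grant_access(admin_user):
    # Tightened scope: still a user, but this code only deals with admins.
    assert admin_user["admin"], "expected an admin user"
    return f"{admin_user['name']} granted access"

# Wrong: passing this same data into a variable called 'report' would lie
# about the content, and that code would be busted.
```

The names here are invented for illustration; the point is only that the data, not the local context, dictates the variable name.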


In some lower-level parts, we might just take the user data, the reporting data, and the event data, convert them all into a common format, and pass that somewhere else. So, the name of the common format needs to be apropos to the polymorphic converted data that is moving around. It might be the format, like ‘json’, or it could be even more generic like a ‘buffer’. Whatever helps best with the readability and informing other programmers about the content. General names are fine when the code has been generalized.


In most systems, the breadth of data is actually not that wide. There are a few dozen major entities at most. For some people, naming things is considered one of the harder parts of coding, but if variables are named properly for their data, and the system doesn’t have that many varieties of data anyway, with not a lot of new ones coming in, then the naming problem quickly loses its difficulty. If the data already exists and is named properly, a new variable holding it elsewhere should not get a new name. There is no need for creativity; the name has already been established. So, naming should be a rather infrequent issue, and where readability dictates that the name should be tightened or generalized, those variations are mostly known or easy to derive.


The other interesting aspect is that there is an intrinsic order to the entities. 


If we were writing a system that delivers reports to users, then the first most important thing we need to know is “who is the user?” That is followed by “what report do they get?” Basically, we are walking down through the context that surrounds the call. A user logs in, then they request a report. 


What that means is that there is a defined order between those two entities. So for any function or method call in the language, if both variables are necessary, they should be specified in that correct order. If there are 3 similar functions, ‘user’ always comes before ‘report’ in their argument lists. 


Otherwise, it is messy if some of the calls are (user, report) and others are (report, user). The backward order is somewhat misleading. Not exactly incorrect, but some pertinent information is lost.
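A small sketch, with hypothetical functions, of keeping that entity order consistent across similar calls:

```python
# Hypothetical sketch: 'user' always precedes 'report' in every signature,
# mirroring the order of the surrounding context (login, then request).
def can_view(user, report):
    return report["owner"] == user["id"]

def render(user, report):
    return f"{report['title']} for user {user['id']}"

def deliver(user, report):
    # Same (user, report) order as its siblings, never (report, user).
    if can_view(user, report):
        return render(user, report)
    return None
```

All three functions are invented examples; what matters is that none of them flips the order and loses that contextual information.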


Now if the system has a couple of dozen entities, the context may not be enough to make the order unambiguous, but there is also an order that comes from the domain. Generally, the two have been enough to cover every permutation.


The corollary to this is that if we are focused on preserving order, but we have code that, instead of passing around the entities, is passing around the low-level attributes individually, it becomes clear that we should fix that code. Attributes that are dependent should always travel together; they have no meaning on their own, so there is no good reason to keep them separated. That is, we are missing an object, or a typedef, or whatever other approach the language has to treat similar variables as a single thing. When that is done, the attributes disappear, there are fewer entities, and of course fewer naming issues.
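For instance, a sketch of bundling dependent attributes into one entity (the entity and fields are hypothetical):

```python
from dataclasses import dataclass

# Hypothetical sketch: these three attributes have no meaning on their own,
# so they travel together as a single entity instead of as loose arguments.
@dataclass
class Address:
    street: str
    city: str
    postal_code: str

def format_label(name, address):
    # One entity in the signature, not three separate low-level attributes.
    return f"{name}, {address.street}, {address.city} {address.postal_code}"
```

Before this change, every call site would have had to pass street, city, and postal_code individually, in some order; after it, the ordering problem for those attributes disappears entirely.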


What’s interesting about all of this is that if we set a higher-level principle like ‘code should never lie’, the consequences of it trickle down into a lot of smaller issues. Names and order relate back to keeping the code honest. To be correct about one means being consistent about the other two.

Thursday, July 23, 2020

Defensive Coding: Locality

A property of spaghetti code is that it is highly fragmented and distributed all over the place. So, when you have to go in and see what it is doing, you end up getting lost in a ‘maze of twisty passages’.


Code gets that ugly in two different ways. The first is that the original base code was cloudy. That is, the programmer couldn’t get a clear view of how it should work, so they encoded the parts oddly and then mashed them all together and got them to work, somehow. The second is that the original code might have been nice, tight, and organized, but a long series of successive changes over time, each one just adding a little bit, gradually turned it into spaghetti. Each change seemed innocent enough, but the collective variations from different people, different perspectives, added up into a mess.


The problem with spaghetti is that it is too cognitively demanding to fix correctly. That is, because it is spaghetti, there is a higher than normal likelihood that it contains a lot of bugs. In a system that hasn’t been well used, many of these bugs exist but haven’t been noticed yet.


When a bug does rear its head, someone needs to go into this code and correct it. If the code is clear and readable, they are more likely to make it better. If the code is a mess, and difficult, they are more likely to make it worse but hide those new problems with their bugfix. That is, buggy code begets more bugs, which starts a cycle.


So, obviously, we don’t want that. 


To see how to avoid this, we have to start at the end and work backward. That is, we have to imagine being the person who now has to fix a deficiency in the code. 


What we really want is for them to not make the code worse, so what we need to do is put all similar aspects of what the code does in as close proximity as possible. That is, when it is all localized together, it is way easier to read, and that means the bug fixer is more likely to do the right thing.


Most computations, in big systems, naturally happen in layers. That is, there is a high-level objective that the code is trying to achieve to satisfy all or part of a feature. That breaks down into a number of different steps, each of which is often pretty self-contained. Below that, the lower steps generally manipulate data, which can involve getting it from globals, a cache, persistence, the file system, etc. In this simple description, there are at least three different levels that have three very different focuses.


When we are fetching data, for example, we need some way to find it, then we need to apply some transformations on it to make it usable. That, in itself, is self-contained. We can figure out everything that is needed to get ‘usable’ data from the infrastructure, we don’t have to be concerned with ‘why’ we need the data. There is a very clear ‘line’ between the two. 


So, what we want from our layers, is clear lines. Very clear lines. An issue is either 100% on one side, or the other, but not blurred. 


We do this because it translates back into strong debugging qualities. That is, in one of the upper layers, all one needs to do is check to see if the data is usable, or not. If it is, the problem is with the higher-level logic. If not, then it is how we have fetched the data that is the real problem. Not only have we triaged the problem, we’ve also narrowed down where the fix may need to go.
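A tiny sketch of that clear line between layers, with made-up data and names:

```python
# Hypothetical sketch: the 'line' between fetching data and using it.
def fetch_user(user_id, store):
    """Lower layer: find the raw record and transform it into usable data."""
    raw = store.get(user_id)
    if raw is None:
        return None
    return {"id": user_id, "name": raw.strip().title()}  # made usable here

def greet(user_id, store):
    """Upper layer: pure logic, never concerned with *how* data was fetched."""
    user = fetch_user(user_id, store)
    if user is None:            # the triage check: usable data, or not?
        return "unknown user"
    return f"Hello, {user['name']}"
```

When a bug report comes in, checking whether fetch_user returned usable data immediately puts the problem 100% on one side of the line or the other.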


There have often been discussions amongst programmers about how layers convolute the code. That is true if the layers are arbitrary, if there are just a whole bunch of them that don’t have a reason to exist. But it is also true that at least some of them are necessary. They provide the organizational lines that help localize similar code and encapsulate it from the rest of the system. They are absolutely necessary to avoid spaghetti.


Getting back to the three-level basic structure discussion, it is important that when there is a problem, a programmer can quickly identify that it is located somewhere under that first higher level. So, we like to create some type of ‘attachment’ between what the user has done in the application and that series of steps that will be executed. That’s fundamentally an architectural decision, but again it should be clear and obvious. 


That means that in the best-case scenario, an incoming programmer can quickly read the different steps in the high level, compare them back to the bug report, and narrow down the problem. When the code is really good, this can be done quickly by just reading it, pondering the effect, and then understanding why the code and the intent deviated. If the code is a bit messier, the programmer might have to step through it a few times to isolate the problem and correct it. The latter is obviously more time-intensive, so making the code super readable is a big win at this point.


What happens in a massive system is that one of the steps in a higher-level objective drops down into some technical detail that is itself a higher-level objective for some other functionality. So, the user initiates a data dump of some type, the base information is collected, then the dump begins. Dumping a specific set of data involves some initial mechanics, then the next step is doing the bulk work, followed by whatever closing and cleanup is needed. This lower functionality has its own structure. So there is a top three-layer structure and one of the sub-parts of a step is another embedded three-layer structure. Some fear that that might be confusing. 


It’s a valid concern, but really only significant if the lower structure does something that can corrupt the upper one. This is exactly why we discourage globals, gotos, and other side effects. If the sub-part is localized correctly, then again, either it works or it doesn’t. It doesn’t really matter how many levels are there.


In debugging, if the outputs coming back from a function call are sane, you don’t need to descend into the code to figure out how it works, you can ignore that embedded structure. And if that structure is reused all over the system, then you get a pretty good feeling that it is better debugged than the higher-level code. You can often utilize those assumptions to speed up the diagnosis.  


So we see that good localization doesn’t just help with debugging and readability, it also plays back into reuse as well. We don’t have to explicitly explore every underlying encapsulation if their only real effect is to produce ‘correct’ output. We don’t need to know how they work, just that the results are as expected. And initially, we can assume the output is correct unless there is evidence in the bug report to the contrary. That then is a property that if the code is good, we can rely on to narrow down the scope to a much smaller set of code. Divide and conquer. 


The converse is also true. If the code is not localized, then we can’t make those types of assumptions, and we pretty much have to visit most of the code that could have been triggered. So, it’s a much longer and more onerous job to try and narrow down the problem.


A bigger codebase always takes longer to debug. Given that we are human, and that tests can only catch a percentage of the bugs that we were expecting, then unless one does a full proof of correctness directly on the final code, there will always be bugs, in every system. And as it grows larger, the bugs will get worse and take longer to find. Because of that, putting in an effort to reduce that problem pays off heavily. There are many other ways to help, but localization, as an overall property of the code, is one of the most effective ways. If we avoid spaghetti, then our lives get easier.

Tuesday, July 14, 2020

Normalization Revisited

Over 10 years ago, I wrote 3 rather rambling posts on ideas about normalization:

  1. http://theprogrammersparadox.blogspot.com/2008/11/code-normal-form.html
  2. http://theprogrammersparadox.blogspot.com/2008/10/structure-of-elegance.html
  3. http://theprogrammersparadox.blogspot.com/2008/10/revisiting-structure-of-elegance.html

It’s not a subject that has ever grabbed a lot of attention, but it’s really surprising that it didn’t. 


Most code out there is ‘legacy code’. That is, it was written a while ago, often by someone who is gone now. Most projects aren’t disciplined enough to have consistent standards, and most often any medium-sized or larger system consists of lots of redundant code, a gazillion libraries, endless spaghetti, broken configurations, no real documentation, etc. 


Worse, a lot of code getting written right now depends on this pre-existing older stuff and is already somewhat tainted by it. That is, a crumbling code base always gets worse with time, not better.


These days we have good automated tooling for different languages, like gofmt (in Golang) or rubocop (in Ruby), that is capable of enforcing light standards by automatically fixing the syntax; often these tools are set to reformat during saving in the editor. This lets programmers be a bit sloppy in their coding habits but auto-corrects it before it stays around for any length of time.


Just putting some of these formatting tools into play in the editors is a big help in enforcing better consistency, getting better readability, and thus better overall quality.


What does that have to do with normalizations? The idea behind normalizing things is that much like in a relational database, there is a small set of rules that control relationship properties. In a database, it is applied to structural issues for the data. In code, it can also be applied to structural issues. 


The execution of code is a serialized list of instructions that each processor in the computer follows, but we often see it as a “tree” of function calls. Each function is a node, with the functions it calls as children. Mostly it looks like a tree, but since we can have reuse, and there can be infinite loops, it’s really a directed graph. A stack dump then is just one specific ‘path’ through this structure. If we dumped the stack a lot of times and combined the results, we’d get a more complete picture.
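A rough sketch of that combining step, with invented function names standing in for real stack frames:

```python
from collections import defaultdict

# Hypothetical sketch: each stack dump is one path through the call structure;
# merging many dumps approximates the caller -> callee graph.
def merge_stacks(stacks):
    graph = defaultdict(set)
    for stack in stacks:                      # stack = outermost ... innermost
        for caller, callee in zip(stack, stack[1:]):
            graph[caller].add(callee)
    return graph

stacks = [
    ["main", "handle_request", "fetch_user"],
    ["main", "handle_request", "render"],
    ["main", "cleanup"],
]
graph = merge_stacks(stacks)
# Reuse shows up as a node with multiple children, so the result is a
# directed graph rather than a strict tree.
```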


We can take any of the instructions in this list of execution steps and move them ‘up’ or ‘down’ into other function calls in the graph. We can break up or reassemble different function calls, nodes, so that the overall structure has properties like symmetry, consistent levels, and encapsulated component structure.


It is probably incredibly slow to take a lot of code and rework it into a ‘semantic’ structure, but once that is in place, it can be shifted around based on the required normalizations, then returned to the source language. It might require a lot of disk space and a lot of CPU, but the time and resources it takes are basically irrelevant. Even if it needed a few weeks and 100 GB, it would still be incredibly useful.


The idea is that if you take a huge pile of messy code, and set up a reasonable number of ‘non-destructive’ normalized refactors, it would just go off and grind through the work, cleaning up the code. It absolutely needs to be non-destructive so that the final work is trustworthy (or at least as trustworthy as the original code), and it needs to be fully batched, to avoid a lot of user interaction. 


You could literally set it running, go off for a two-week vacation, and come back to a trustworthy, cleaned-up codebase that had exactly 0 operational differences from the mess you had before you left. You could just flip that into production right away, without even having to do regression testing.


A well-set group of rules could clean up variable naming, shift around the layers to unwind spaghetti, put a scope on globals, and even put the basis there for commenting. As well, it could identify redundant code, even if it doesn’t do the merge itself, and it could definitely identify useless data transformations, missing error handling, inconsistent calls, and a huge number of trivial and obvious problems. Basically, it’s a code review on steroids but applied across the entire system, all at once.


It could create menial unit tests, redo the configurations, and shift behavior from one type of resource to another (config file -> database, or vice versa). Basically, a huge amount of super useful cleanup work that programmers hate to do and procrastinate on until it causes grief.


To deal with interactivity, suggested changes could be put into comment blocks. So, it generates a better method in the code and comments it out, with the same name as the method it replaces. Much like dealing with SCM merges, you could read both versions, then pick the better one (and add some type of test to vet it properly).


Once the code is normalized, it takes considerably less cognitive effort to read through it and see the problems. Most of them will be obvious. Starting something, without closing it. Doing weird translations on the data, not checking for errors, and not doing any validations on incoming data. Most bugs fall into these categories. It could also help identify race conditions, locking issues, and general resource usage. For instance, if you normalize the code and find out that it has 4 different objects to cache the same user data, it’s a pretty easy fix to correct that. Bloat is often a direct consequence of obvious redundancies. 


You could look at the resulting code right away, or just leave it to explore on an as-needed basis. Since it’s all consistent and follows the same standards, it will be way easier to extend. 


It would also be useful if it had an idiom-swapping mechanism, for languages where there are too many different ways to accomplish the same thing. If there are 4 ways of doing string fiddling, then swapping the other 3 down to 1 consistent version helps a lot with readability.


What kills a lot of development projects is that the code eventually becomes so messy that people are afraid to change it. Once that happens, either it goes crazy as an onion architecture (new stuff just wraps around the old stuff with lots of bugs) or it slows down into maintenance mode. Either way, the trajectory of the codebase is headed downwards, often quite rapidly. 


If all it took to get out of that death spiral was to spend a week fiddling with the configuration and then a week of extreme processing, that would make a huge difference. 


Code is encapsulated knowledge about the domain and/or technical problems. That is a lot of knowledge, captured over time, of intermittent quality, and it would be a huge amount of work to go backward, throwing all of it away and starting over. Why? If you can take what was already there, already known, and leverage it, not as an unreadable opaque box but rather as a good codebase that is extendable, then life just got easier. People are somewhat irrational, programmers all have different styles, and as a big ‘stew’ of code, it is overwhelming. If you can just normalize it all into something you want to work on, then most of those historic influences have been mitigated.

Wednesday, July 8, 2020

Defensive Coding: Direction

A rather big question in software development that typically gets avoided is whether or not the development project is going well. 


That seems like an easy question. There is more code than last year, there are more features, more people are using it, etc. But those types of metrics really don’t capture momentum. They are still somewhat short term. 


For example, you start building a domain-based inventory system, and it all seems great. It’s using a fairly recent tech stack, there are a growing number of users and lots of new features are in development. So, it’s a success, yes? If you could fast forward to 2 years later, you might find that the system has become hopelessly over-complicated, it’s kinda ugly and slow now, the database is full of questionable data, the code is a mess and the original dev team has moved on to greener pastures. What happened? 


We could have looked at the project 2 years earlier and seen the seeds of its destruction. It was there, in the workmanship, the process, and generally the direction. 


Those ongoing little problems gradually become the dominant, fatal issues. They start small but multiply quickly. To see them through the noise requires looking at higher-level properties of the project. 


For instance, it’s not really the amount of code you have, it’s the amount of code that isn’t crappy or misplaced that matters. It’s not how many features you have, but rather the number that are easily accessible and obvious to a user during their normal workflows. It’s not the number of releases you’ve done, or whether you have made them on schedule, but really the operational stability that matters. Looking at these types of metrics gives a better sense of momentum.


We could really build up some serious underlying metrics that are geared towards showing these growing problems, but there is a much easier way to see it. 


You’re working hard for this upcoming release, but after that is done, will the work get easier or harder for the next release? That’s it. That is all there is to it. 


If the ongoing work is getting easier, it’s because you’ve built up and refined better code, processes, knowledge, etc. Then the momentum of the project is positive. 


If each time, it is getting harder, the things that you are trying to ignore, or workaround, or just hope to go away are getting bigger and becoming more of a blocker, then the momentum of the project is negative. If a project suffers from a bunch of negative releases, it has most likely gotten caught in a cycle, and that is very difficult to get out of. 


With that foundation then it isn’t hard to start getting into particular details. What would make some upcoming work easier? What would make it harder? We just want to spend time reducing friction and providing more ability to make the work go easily. 


We can look at a few specific issues. 


First, if there are 4 or 5 programmers, and each one is coding in their own unique style, then unless they are fully siloed from each other, their ability to utilize, fix, or incorporate each other’s work is compromised. Slower.


The opposite is also true. If there are well-defined standards and conventions that everyone is forced to follow, then moving around the entire codebase isn’t that difficult. There might be some type of domain understanding necessary, but the technical implementations are obvious. 


Following standards is obviously a bit slower, and it takes a bit of ramp-up to learn them, but when traded off against a detached, siloed codebase or a big ball of mega-mud, it is a huge improvement.


The same is true for frameworks and libraries. If everyone has thrown in their own massive set of dependencies, then moving around the code requires epic amounts of learning, which eats time.


Often in big systems, there is a lot of build mechanics and configuration floating around. If that is set up cleanly, then it’s not too hard to absorb it and enhance it. If it is spaghetti, then it just becomes another obstacle.


Abstractions are the double-edged sword of most development. On the one hand, they reduce code, often by orders of magnitude, and in doing so they kick up the quality. If they are reused all over, they also cut down on the redundancies and impedance mismatches. On the other hand, they can be weird enough that most other programmers have little hope of understanding them, or their implementations can be impenetrable. If they are nicely documented, with clean decompositions, and encapsulated, they are a huge strength and a massive reduction in work, code, time, etc. But they need to be clean, and there needs to be a way that most of the current and future team understands them. That has been a growing problem, particularly with the over-reliance on question-and-answer sites for patching code to avoid understanding how it works.


The overall flow of the development matters too. 


Are the ideas for new changes coming from feedback by people who actually use the system, or is it more of a wanton creative exercise based on assumptions? The way the work enters ‘the development pipeline’ usually defines its quality. A weird, non-essential, misplaced feature is just wasting space and time; its contribution is negative.


Once the work is in the pipeline, there are lots of questions that need to be answered, usually with very precise details. Again, if that is getting skipped, it’s substituted with more assumptions or bad facts, so any downstream work is unlikely to be positive. 


There is, too, a strong necessity to ensure that the goals are technologically feasible. Adding a feature to search everything doesn’t help if it takes hours to return, or forces a crazy complex replication and caching architecture into being. That might work for Google search, but it is beyond the scale that most systems can handle. The costs (money and time) are unrecoverable.


In larger shops, there is usually a need to parallelize the pipelines, so making sure that they don’t knock each other off course means having a process around them to control, track, and adjust as the priorities and schedules shift around.


I could continue, adding a lot more, but I think that stepping back and assessing whether or not the endless series of releases is getting easier or harder is such a strong way of focusing in on slow-brewing problems that it doesn’t need to be explicit. It’s an easy question to ask, and you can get valid answers directly from the development teams themselves. If things are getting harder, then you mostly need to identify the many reasons why this is happening and start to mitigate them, one by one, in order of contribution. For example, if it is hard to release the code, requiring lots of steps that are often forgotten, then coding that as a single script is obviously going to shave off a significant part of the blockage. Getting rid of the blockages gets rid of the friction and frees up more time to spend on better issues like quality, readability, or even performance. A positive direction for a big project is far better than letting gravity do its job.
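A minimal sketch of that kind of release script, where the step commands are invented placeholders rather than any real project’s build:

```python
import subprocess

# Hypothetical sketch: the easily forgotten release steps captured in one
# place, so a release becomes a single command instead of a fragile checklist.
STEPS = [
    ["git", "pull", "--ff-only"],
    ["make", "test"],
    ["make", "build"],
    ["make", "deploy"],
]

def release(run=lambda step: subprocess.run(step).returncode):
    """Run each step in order, halting at the first failure."""
    completed = []
    for step in STEPS:
        if run(step) != 0:
            return completed, step      # steps done so far, plus the failure
        completed.append(step)
    return completed, None              # everything succeeded
```

The injectable `run` parameter is just an illustrative choice here; it keeps the sequencing logic checkable without actually shelling out.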

Monday, July 6, 2020

Defensive Coding: Cleanup

In one of my first programming jobs, long ago, my boss would make us clean up the office on the last Friday of every month. Even though we were really busy, we’d take the whole day and rearrange stuff, wipe off the dust, go out and buy new furniture if needed, and just focus on making sure our physical office space looked presentable. We weren’t allowed to work on coding or other digital tasks. Even if we were running late for a deadline. 


At the time, I didn’t get it and thought it was eccentric. After all, we were super busy and had lots of technical stuff to do, so why would we waste time on the physical environment?


It wasn’t until a bit later, when I worked with people who kept procrastinating on any and all cleanup tasks, that I began to figure it out. If there is a little bit of cleanup work building up over time, and you keep a fairly regular schedule of making sure it is done, it is not a significant problem. But as you leave it for longer, it compounds, until it becomes a rather huge problem. Once it is big, scheduling it is harder, so you procrastinate even more, and eventually find yourself increasingly hampered by it, while gradually losing the ability to fix the issue.


If you watch a master craftsman or painter go about their work, you often see similar habits. They make sure their workspace and tools are perfectly arranged, as the warm-up task for getting the job done. Working in a messy environment interferes with their ability to move fluidly through the task, which ultimately hurts the quality of the work.


In software, it’s not just the physical environment that needs tending to. The setup and maintenance of the tools we use, and the build, test, and release cycle are important as well. If you struggle to run even basic tests on the code, that will eat away at your concentration on the code itself. It’s always worth spending time on these parts of the work to get them smoother, working better. If there is a large team that loses a couple of minutes per day on scratchy, ill-fitting tools, it may seem like it’s a small issue, but when you start piling it up over years, it adds up quickly. And if you account for the intangibles, like that it affects mood and context switching time, then it adds up a lot more rapidly than people realize. 


Even if the physical and development environments are all neat, tidy, and kept up to date, there are still plenty of internal code issues that need cleaning up as well. The structural organization of a large project consists of a high-level architecture, some mid-level component organization, and a lot of low-level arrangements. If these are missing or disorganized, it becomes harder to know where to put new pieces of code, and that messiness slows down the rest of the work. The pieces don’t fit, or they are awkward, or the code already exists in multiple other places. Any doubts that come from that eat away at morale and confidence. In that sense, if you know what code to write but aren’t sure where it belongs, then the structure isn’t organized, the codebase is messy, and you are losing valuable time because of it.


Nobody likes cleaning up. But the longer you avoid it, the worse it becomes. Living in a huge mess is a lot harder than living in a nice, tidy environment. If everything you need to do turns into yak shaving, that is a lot of wear and tear on your ability to get things done. Some people are better at ignoring it, tunneling their focus onto tiny parts of the mess at any one time, but as a habit, all that does is ensure that the quality remains low. It’s far better to identify what is blocking you or slowing you down, and to spend a little time each day, week, or month trying to improve it.


What matters is treating it as many, many small, incremental cleanup tasks, fit in as often as necessary. It is also important to make this an ongoing, regular habit, instead of a special case that you can ignore.


Not only does it help with the morale and flow of the work, but it also helps with larger issues like estimation. If you dedicate one day a week to cleaning up, and you know that the next task should take 10 days, you also know that it isn’t going to be done in 2 weeks. It will spill over into the 3rd week, so it is better to predict that in advance, rather than have it turn out to be a surprise later.
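The spill-over can be sketched with the same numbers; the five-day working week is my assumption:

```python
# A 10-day task, with one day per week reserved for cleanup,
# assuming a five-day working week.
import math

TASK_DAYS = 10
DAYS_PER_WEEK = 5               # assumed working days per week
CLEANUP_DAYS_PER_WEEK = 1       # the dedicated cleanup day

productive_days_per_week = DAYS_PER_WEEK - CLEANUP_DAYS_PER_WEEK  # 4
weeks_needed = math.ceil(TASK_DAYS / productive_days_per_week)

print(weeks_needed)  # the task spills into a third week
```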


Building big systems isn’t a dynamic, exciting profession. It’s a lot of work, most of it routine, and being good at it requires patience, discipline, and a set of really strong habits to make sure that each time you add to the codebase, the work is good enough not to blow up in your face later. If you keep that up while the world spins wildly around you, you end up building something worthwhile that people will use to make their lives easier. If you get caught up in the whirlwind and lose concentration, then you will wake up one day with a huge pile of unusable code that is causing everyone around you to be pissed off.

Saturday, July 4, 2020

Opaque Boxes

The idea is to create an opaque box. You can use it to do a few well-specified things, but you need to stay within its tolerances.


The box has some finite set of entry-points. One or more of those points may allow some dynamic behavior or statefulness, represented by a question-and-answer paradigm.


If the box can do X, there is a way of testing whether X has been done, or not. 


If the box provides optimized methods, all of the information needed for those optimizations is constrained entirely to the inside of the box and is not available on the outside. If there is a means of toggling an optimization, then there is a way to see whether it is set.


If the box stores data for later use, the outside format is RAW and it is converted to something else once inside the box. There is a means of getting the RAW data back exactly as it was specified. If there is some other useful internal format, there might be a way of returning it directly, but only if that format does not explicitly depend on any of the ‘code’ in the box. So, it’s more likely that if the box doesn’t implement an external standard, then getting back FORMATTED data requires the box to format it on the way out.
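The RAW round-trip property can be sketched in a few lines. The class and method names here are mine, invented for illustration, and JSON stands in for whatever internal format a real box might use:

```python
# A minimal sketch of a box that accepts RAW data, converts it to a
# private internal format, and can always return the RAW form exactly
# as it was given. Names and the JSON internal format are assumptions.
import json

class StorageBox:
    def __init__(self):
        self._raw = None       # the RAW bytes, kept as-is for round-tripping
        self._internal = None  # converted form, never exposed outside the box

    def put(self, raw: bytes) -> None:
        self._raw = raw
        self._internal = json.loads(raw)  # internal conversion happens inside

    def get_raw(self) -> bytes:
        return self._raw  # exactly what was put in


box = StorageBox()
box.put(b'{"name": "example"}')
assert box.get_raw() == b'{"name": "example"}'  # RAW comes back unchanged
```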


A simple box equates to the computational model of a Turing Machine (TM). That is, from the outside perspective, it is just one computational engine. It takes some input, runs a computation, and produces some output. It is deterministic and will always produce the same output for the same input.
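A simple box, then, looks like a pure function behind a named entry-point. This tiny sketch (the interface is mine, not from the post) shows both the single entry-point and the determinism property:

```python
# A minimal "simple box": one well-specified entry-point, deterministic
# behavior. The class, method, and computation are illustrative only.

class SimpleBox:
    def compute(self, data: str) -> str:
        # A pure computation: no hidden state, no side effects,
        # so the same input always produces the same output.
        return data.upper()


box = SimpleBox()
assert box.compute("hello") == box.compute("hello")  # determinism holds
```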


The communication between the calling program and the box may or may not be asynchronous, but if it is asynchronous, then there is a means of binding together different entry-points into the same operation (they all happen or they all fail; there is no in-between).


A non-deterministic box is a simple box where the output can change, or is guaranteed to change, even though the inputs are consistent.


A composite box is one that encapsulates the interaction between many TMs. A request, whether it is read-only or a write, is bound to the behavior of a set of different computational engines that should be assumed to be asynchronous. Even if there are multiple TMs sitting on a synchronous bus, we still shouldn’t consider them to be just a single TM.


A composite box may or may not preserve order. It may or may not protect interdependencies. The onus, then, is on the user to ensure that a composite box is only used under reasonable circumstances, where independence is either easily provable or where each and every dependency is guaranteed by known properties of the specific box.


A simple box, subject to underlying resource partitioning, will usually return results within a fixed range of time. There is the potential for the halting problem to kick in, so it could possibly never return. On top of this, a composite box may also wait forever on the results of an underlying asynchronous communication, so some of its operations may pause and eventually exhaust any queuing, which adds a flow problem on top. The caller of either box needs to set a bound for the operation and to react appropriately when the box is indisposed. That is different from it being unavailable.
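The caller-side bound can be sketched with standard timeout machinery. The function and its delay below are stand-ins, not anything from the post; the point is only that the caller, not the box, decides how long to wait:

```python
# The caller sets a bound on a box operation and reacts when the box
# is indisposed (slow to answer), which is different from unavailable.
# The box call and its delay here are illustrative stand-ins.
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

def slow_box_call():
    time.sleep(1)        # a box that does not return promptly
    return "result"

with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_box_call)
    try:
        value = future.result(timeout=0.1)  # the caller's bound
    except TimeoutError:
        value = None     # indisposed: fall back, retry, or report upward

print(value)
```

Whatever the reaction is, the decision belongs on the outside of the box, because only the caller knows what the larger operation can tolerate.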


A composite box can be built around a collection of other simple and composite boxes. If any of those boxes is asynchronous or non-deterministic, the outer box picks up those properties, unless something has been explicitly done to negate them.


It is possible to frame any one-time or continuous computation in terms of boxes. The model is complete. In decomposing a problem this way, the attributes of the boxes have to be explicit and understood, so that anything built on top can handle the communications with care. While this perspective is somewhat abstract, it is also formal enough to be used as the structural guideline for a larger architecture. That is, we could decompose a very large system into a great number of boxes that cover 100% of the code. If the boxes work properly and all of the attributes are explicitly handled, then the system works properly.