Monday, April 13, 2020

Bullet Specs

Implementing the wrong code wastes everyone's time. A good specification that prevents this makes a real difference in keeping a software development project from derailing.

For software, time is a scarce resource, so it is far more effective to work out any design issues long before coding. If you wait to fix a mess later, in the code, it rarely happens, and everything built on top of it gets compromised. Stack a mess high enough and it becomes unusable.

So, a sane project that is expected to build non-trivial software needs specifications to avoid wasting its resources.

Over the decades, I’ve used all sorts of methodologies and formats for producing specifications, but they have been either a) too big and bloated, or b) too vague. One wastes precious time, while the other isn’t precise enough to dredge up any critical problems before they become destructive.

Lately, I’ve been using what I call ‘bullet specs’. It’s a compressed variant on what some people might call ‘manage-by-fact’. It’s far from perfect, but it is fast enough to meet time expectations, while still providing enough detail to keep the coding work from going bad.

The idea is that you specify the smallest set of bullets that need to be satisfied. The bullets need to be short, concise, and contain only factual statements. If it’s important, then it is a bullet. If it isn’t listed, it can be whatever; it’s a free variable.

You belt out only, exactly, what you know has to be true for the code to be accepted in production. Nothing more, nothing extraneous: no opinions, whys, speculations, or anything else that is not a factual bullet. Just the necessary facts.
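As a made-up example, a bullet spec for a small nightly export feature might be nothing more than:

- runs nightly, after the close-of-day batch has finished
- exports all active accounts into a single CSV file in the existing outbound directory
- columns: account id, legal name, status, last-updated timestamp
- any failure alerts the existing ops channel and never leaves a partial file behind

Everything else about the feature is a free variable.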

That works quite well, but it still needs more in a lot of cases. Bullets are not quite expressive enough to communicate all of the issues.

I tend to think of architecture as a means of organization that draws hard ‘lines’ separating different sets of code in the system. The best format for these lines is a simple diagram, without a lot of fiddly bits. E.g., draw two boxes; any specific piece of code goes into one or the other, but never both. If the code is in the wrong box, it needs to be moved.
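As a minimal sketch of what such a line means in practice, assume a hypothetical system split into a ‘storage’ box and a ‘reporting’ box (all of the names here are made up):

# storage box: the only place that touches the database
def load_accounts():
    # stub standing in for the real query; no formatting logic is allowed in this box
    return [{"name": "Acme", "status": "active"}]

# reporting box: the only place that formats output for users
def account_summary():
    # presentation only; no SQL or file I/O is allowed in this box
    return [f"{row['name']}: {row['status']}" for row in load_accounts()]

# the line: reporting may call storage, but storage may never call reporting;
# any formatting code found in the storage box is in the wrong box and gets moved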

So, if we were interested in laying out a fast spec for a big system, it would consist of a few top-down diagrams that chop up the mechanics and a bunch of bullet specs that tighten down each part. It’s not onerous to read, and it is precise enough not to be arguable. If it is followed, the results are predictable.

That’s fairly inexpensive, but it can still go wrong, usually due to convoluted business logic. So, the third piece that is sometimes needed is a data model. For those, I find that classic ER diagrams fit most domains well, but I sometimes augment them by treating the entities as data structures themselves, as I discuss in this post: http://theprogrammersparadox.blogspot.com/2017/09/data-modeling.html
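A minimal sketch of that, with a couple of made-up entities (illustrative only, not taken from the linked post):

from dataclasses import dataclass, field
from datetime import date

@dataclass
class Order:                 # hypothetical entity
    order_id: str
    placed_on: date
    total_cents: int         # money held as integer cents

@dataclass
class Customer:              # hypothetical entity
    customer_id: str
    legal_name: str
    orders: list[Order] = field(default_factory=list)   # one-to-many relationship from the ER diagram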

Since we’ve worked out the underlying data, its types and its structure, and we’ve organized the code into explicit places, the contents of the bullets fill in all of the remaining facts.

A full specification is complete enough that the ‘coding’ can be fairly thoughtless, but, again, to save time we might not want to invest that much effort upfront, or the work may be better done by people closer to the metal. Either way, we can look at specifications at three levels: high, medium, and low.

If the spec is high level, it will list out the required properties or metrics that the system as a whole needs to meet, along with the overall runtime structure. It might cut the system up into a set of libraries. It will list out the major technologies used. The focus is on solving problems for people.

At a medium level, it might list out an API and the incoming and outgoing data types, or it could specify the major features to be used from an underlying library or framework. It may lay out parts of some of the screens or the parameters for a CLI. The focus is on sets of usable features.
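As a rough sketch, a medium-level bullet such as ‘account lookup takes an account id and returns the legal name, status, and balance’ (a made-up example) might only need to pin down the shapes:

from dataclasses import dataclass

@dataclass
class LookupRequest:         # incoming data type (hypothetical)
    account_id: str

@dataclass
class LookupResponse:        # outgoing data type (hypothetical)
    legal_name: str
    status: str
    balance_cents: int

def lookup_account(request: LookupRequest) -> LookupResponse:
    # the spec fixes the signature and the types; the body is left to the programmer
    raise NotImplementedError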

At a lower level, it might pin down attributes like locking, formulas, computations, user options, particular algorithms, or even variable names. The focus is on ensuring strong implementations.
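At that level, a single bullet, say ‘balances are rounded half-up to the nearest cent’ (again, made up), leaves no wiggle room at all:

from decimal import Decimal, ROUND_HALF_UP

def round_balance(amount: Decimal) -> Decimal:
    # the rounding mode comes straight from the bullet; it is not a free variable
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)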

Collectively, if all three levels existed, they would cover all of the ‘major details’ that could be found in the code; there would be no wiggle room. Most often, though, the specs can save time by leaving some of the details up to the individual programmer or to the agreed-upon architecture, styles, or conventions for the project. If part of the system is already built in a particular way, the spec can assume that the programmers will follow suit.

What’s specified is only those details that can’t be gotten wrong, where for high-level reasons there is no wiggle room.

Now, some people might suggest that with any new code it is impossible to know what has no wiggle room before actually writing it. That’s really only true for inexperienced coders or pure research projects. For everything else, a specification would reduce the risk of making a mess, which would reduce the amount of time spent on the code, which would help the project achieve its goals. More importantly, it would allow senior developers to lay down the fundamentals for less experienced coders, keeping screw-ups from being caught way too late, in code reviews, testing, or production. Why wait until the end to find out it’s wrong? Why let it go that far?

If a team doesn’t have anyone who can produce any type of spec for the upcoming development work, then that problem should be addressed before continuing. Big projects need strong technical leads. It’s fun to try to stretch our abilities in building complex stuff, but there is a ‘bridge too far’ that usually ends in tears. We should do a better job of not starting projects that are impossible without the necessary prerequisites in place. That would save us from a lot of coding disasters.
