Sunday, November 1, 2020

Software Development Workflow

When someone needs a computer to help them, the very first thing that should happen is that all of the specifics of the problem are written down.

A vague idea of the problem and a potential way to solve it is a reasonable place to start, but to build something that fully works, all of the details eventually have to be gathered, and it is far more efficient to do that earlier rather than later.


This analysis includes a definition of the problem, the data it involves, the structure and frequency of the data, any calculations that need to be made, expected user workflows, and all of the interconnections to other systems. There are always lots of moving parts, even for simple solutions. These details are then put together in a document. This document is added to the library of previous analysis documents used to build up the whole system.


The analysis documents are the input for design, both technical and graphical.


A system spans a related series of problems, and a design needs to lay out a matching set of solutions. The architecture needs to be defined, the interfaces need to be arranged coherently, and the interconnections with other systems or components need to be prototyped and tested.


There are a number of overall documents that put caveats, requirements, and restrictions on the final design of a system. Sometimes the pieces are brand new, and sometimes they are just modifications to existing work, so for many of the components some effort is needed to go back to the older analyses and designs as well.


The actual technical designs include high, mid, and low-level specifications. The depth of the design is relative to the programmers who will be tasked with building it. Difficult issues or less experienced teams need lower-level detail. 


For interfaces, issues like colors, fonts, design grids, screen layouts, navigation, and widget handling all need to be specified. As well, an interface only comes together when its individual features are easily interoperable. If they are intentionally siloed, the program will be awkward or difficult to use.
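
One lightweight way to keep those interface decisions interoperable is to specify them once, as shared values, rather than letting each screen hard-code its own. A minimal sketch, with invented names and values:

```python
# Design tokens: one shared source for the interface specifications, so
# every screen stays consistent. All names and values here are invented.
DESIGN_TOKENS = {
    "colors": {"primary": "#1a3c6e", "error": "#b00020"},
    "fonts": {"body": ("Inter", 14), "heading": ("Inter", 20)},
    "grid": {"columns": 12, "gutter_px": 16},
    "navigation": {"max_menu_depth": 2},
}
```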


All of this design work is added to the library of previous designs for the system.


At the coding stage, the programmers now have a good idea of what they are writing, how it fits into the existing system, and what data they need from where. The depth of the specifications should match their background experience. They combine all of these different references to produce the code that will be used both for the system and for any diagnostics/testing that occurs in development or operations.


As they work, they often find missing details or ambiguous specifications. They push these issues back to the originating sources, either design or analysis. 


For the base mechanics of their code, they perform very specific tests on issues like obvious data problems, fault handling, logic flow through the permutations, etc. They also perform operational testing on startup behavior, basic failures, slow resources, big data, etc. Once they feel that the code is sufficiently stable, they submit it to integration testing via the source code repository.
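
A minimal sketch of what that base-mechanics testing might look like, using a hypothetical parsing helper to stand in for real code (the function and its edge cases are invented purely for illustration):

```python
import unittest

def parse_amount(raw: str) -> float:
    """Hypothetical helper: convert user-entered text to an amount."""
    if raw is None or not raw.strip():
        raise ValueError("empty amount")
    value = float(raw.replace(",", ""))
    if value < 0:
        raise ValueError("negative amount")
    return round(value, 2)

class ParseAmountTests(unittest.TestCase):
    # Obvious data problems: empty, malformed, and negative input.
    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_amount("")

    def test_malformed_input_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_amount("abc")

    def test_negative_input_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_amount("-5")

    # Logic flow through the normal permutations.
    def test_thousands_separator_is_handled(self):
        self.assertEqual(parse_amount("1,234.567"), 1234.57)

if __name__ == "__main__":
    unittest.main()
```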


Development and integration testing are done relative to the technical issues within the code. It’s testing to see that the code is operational and that it will perform as the technical specifications require. It does not test the fitness of the solution in solving the user’s problems. That is handled by QA.


After the code is built and integrated into the existing system, it needs to be run through a broader series of tests that relate it back to its usage. The input to these tests is the analysis and design that preceded the code. The things that need to be tested now are the ones that were specified. That is, if the analysis identified a specific workflow, then there should be at least one test that verifies the workflow behaves as expected. So, the user acceptance and correctness testing is driven from the deliverables that existed long before the code was written. These tests should be independent of the programmers, to ensure that their assumptions and biases were not baked into the tests.
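
As a rough illustration, if the analysis specified a “create, fill in, and submit a report” workflow, then at least one acceptance test should walk that workflow end to end. The ReportSystem class below is a hypothetical stand-in for the real system under test:

```python
# Sketch: each workflow specified in the analysis maps to at least one
# end-to-end test. All names here are invented for illustration.
class ReportSystem:
    def __init__(self):
        self.reports = {}
        self.next_id = 1

    def create_report(self, title):
        rid = self.next_id
        self.next_id += 1
        self.reports[rid] = {"title": title, "lines": [], "status": "draft"}
        return rid

    def add_line(self, rid, amount, category):
        self.reports[rid]["lines"].append(
            {"amount": amount, "category": category})

    def submit(self, rid):
        self.reports[rid]["status"] = "pending_approval"
        return self.reports[rid]["status"]

def test_submit_report_workflow():
    # The steps mirror the workflow as written in the analysis document.
    system = ReportSystem()
    rid = system.create_report("Travel")
    system.add_line(rid, "42.00", "taxi")
    assert system.submit(rid) == "pending_approval"
```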


Whenever testing is completed, a set of defects is identified. These are prioritized: some need to be corrected before the code can be used operationally, while others are just livable annoyances. This type of testing happens in ‘rounds’: problems are found and corrected, and a new round is started. Quite obviously, the list of defects in each round should be as close to complete as possible, so that the work involved here is minimized. Testing is most efficient when handled in batches.


At some point, the software still has issues, but everyone can live with them, so it is moved into operations. Some commercial software is published; SaaS or in-house code is put into production. This means that it is monitored by a special group of people to ensure that it is running, that it isn’t hitting resource issues, and that it is satisfying the users’ needs.


The code will have common problems; these are documented, along with the reactions/solutions to apply as they occur. When operational issues happen, they are tracked, and this tracking is fed back to the developers. If a new, unexpected problem occurs, then the programmers might have to get involved in diagnosing it; if the problem has occurred at least once already, they should not need to be. Operations keeps this library of common problems and their mitigations.
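
That library can be as simple as structured, searchable entries that operations consults before escalating anything to the programmers. A sketch, with invented fields and an invented incident:

```python
# Sketch of one entry in an operations runbook of known problems and
# their mitigations. The fields and the incident itself are invented.
RUNBOOK = [
    {
        "symptom": "nightly batch job overruns its 02:00-04:00 window",
        "first_seen": "2020-06-12",
        "diagnosis": "upstream feed arrived late; the job blocks on it",
        "mitigation": "rerun the job manually once the feed arrives",
        "escalate_to_developers": False,  # known problem: ops handles it
    },
]

def known_mitigation(symptom: str):
    # Check the library first, before pulling in the programmers.
    for entry in RUNBOOK:
        if symptom in entry["symptom"]:
            return entry["mitigation"]
    return None  # new, unexpected problem: escalate to development
```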


It should be noted that a software developer is involved in all five stages, while a programmer is often just working during the coding portion. As the system grows, the library of analysis and designs should grow as well. The code isn’t the only resource that needs to be collected into a repository.  


For a large or complicated system, the process is very complex. Since the time it takes to do all of the work exceeds the time available, development happens in a very large series of iterations. It is often running in various parallel stages, which significantly amplifies the complexity (and threatens the quality). Deficiencies in any part of the work manifest as bugs in the system. This is expected; we cannot build complex systems perfectly, but we can at least ensure that they converge on being more correct as the project matures.


The work involved in initially setting up a brand new project (the analysis, design, architecture, environment, configurations, etc.) is substantial and often rushed. This causes foundational problems that can really slow down the progress of the development over the years. As such, doing a better job of initially setting the project up has a significant impact on its success over its lifetime. Correcting problems early is essential. Also, it is never safe to assume that the initial setup was fine; it is better to double-check that it was done reasonably, since most often it was not.


Consistency, in any part of the project, cuts down on the total amount of work that is necessary. If the work is getting harder as the project progresses, it is usually because the process itself is disorganized. Keeping everything organized is a huge time saver, and so is avoiding redundancy. Many projects get caught in a vicious cycle of taking too many shortcuts to save time, only to have the earlier shortcuts waste more time than they saved. Once that cycle starts, it can be very difficult to break.


Sometimes, for rapid prototyping, the analysis, design, and coding stages will all get intermingled. One coder or a small group of people will do all three at once, in order to expedite the development. That gets the code done really fast, but it obliterates any reasonable ability to extend that code. It should be thrown away, or at least heavily refactored, after it has served its immediate purpose. Weaknesses in organization, consistency, architecture, etc. can force later work to wrap around the outside. New layers effectively freeze or disable the original code, causing fragmentation and the preservation of bugs. That can start another cycle, where the existence of the older code makes it increasingly harder to make any new code work properly, eventually halting the development.


If there is no documentation, or if the documentation was lost, you can reverse engineer the specifications from the code, but any of the driving context is lost. You know what the code is doing, but you don’t know why someone chose to implement it that way. That means the logic is either now locked in stone, or you will have to rediscover what someone else already went through. Neither outcome is good, so building up libraries of analysis, design, and problem mitigations is quite beneficial.


It is crucial to never put untested code into production. Since there is always a need for rapid bug fixing, the shortest path to write code, test it, and then deploy it is, for most systems, less than a day. This is predicated on reproducing the bug in the dev or test environment, so that the potential change can be shown to work as expected. For normal development work, the time it takes to go from analysis to production is weeks, months, or even years. The two workflows are very different; both are necessary.
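
A small sketch of that rapid-fix discipline: reproduce the production bug as a test in dev, write the fix, and let the same test prove the change works before deploying it. The function and the bug report here are hypothetical:

```python
import math

def page_count(total_items: int, page_size: int) -> int:
    # The fix: the old code used total_items // page_size, which dropped
    # the final partial page (the reported bug: 101 items at 50 per page
    # showed 2 pages instead of 3).
    return math.ceil(total_items / page_size)

def test_reproduces_reported_bug():
    # Encodes the exact case from the production bug report.
    assert page_count(101, 50) == 3
    # And a boundary that already worked, to guard against regression.
    assert page_count(100, 50) == 2
```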


When software development is proceeding well, it is a reasonably smooth occupation. When it is degenerating or trapped in a bad cycle, it is very frustrating and stressful. The amount of friction and yak shaving in the work is an indicator of how well the process suits the effort. If everything is late, and everybody is angry and tired, then there are serious process problems with the development project that can and should be fixed.
