Monday, August 4, 2014

Mathematics and Software Development

Programming a computer is the act of building up a large number of instructions for a machine to follow, based on ‘primitive’ operations and underlying libraries. These instructions, or ‘algorithms’, are always executed rigorously, which is occasionally not what we intended. Thus ‘bugs’ may interfere with the user’s objectives, but they will not harm the machine itself (although they can occasionally damage peripherals). The machine is simply following the steps it was given, in a deterministic manner.

Underneath, the computer manipulates ‘data’, which itself has to fit a predefined format. Data stored in a computer is a symbolic placeholder for things that exist in the real world. It doesn’t really exist, but it can be used to track and analyse the world around us.

As such, software can be viewed as a ‘system’ whose set of instructions is ‘formal’. We can’t just create any arbitrary collection of bits and expect it to run; the computer will reject the code or data if it isn’t structured properly.

Mathematics, in its simplest sense, is the study of ‘abstract’ formal systems. Mathematicians create sets of ‘primitives’ that act on abstract mathematical ‘objects’. Collectively, the primitives are used to express rigorous relationships between the objects; often these can be combined together to form ‘theorems’ and algorithms. That is, for any mathematics to be valid, it must conform to strict formal rules. Mathematical objects exist only in the abstract sense, although they are often used symbolically to relate back to real things in this world. Doing so allows us to explain or predict the way things are working around us. There are many domains that attempt to model the real world by applying mathematics, including statistics, physics, economics and most other sciences.

Both mathematics and computer languages are primarily about formal systems. They both allow us to build up the underlying primitives into larger components such as theorems or libraries. They both exist away from the real world, and their utility comes from mapping them back. The most expressive underlying formal system for computers is the Turing Machine. Within this context we often create other, more specific systems like programming languages or applications. Turing Machines also exist within mathematics; however, they are not the most expressive formal system that is known. There are larger formal systems that encompass Turing Machines. As such, we can see that the formal systems within computers are a subset of those within mathematics.
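
As a small illustration of what a computational formal system looks like, here is a minimal Turing machine simulator sketched in Python; the tape alphabet and the transition rules are invented for the example.

```python
# A minimal Turing machine: the 'primitives' are a tape, a head position,
# a state and a table of transition rules -- nothing more.
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells))

# Invented rules for the example: flip every bit, then halt on the blank.
rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("10110", rules))   # prints 01001_
```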

An interesting difference is that mathematics is completely abstract. It can be written down in a serialized fashion, but it is not otherwise tangible. Software, however, runs on physical machines that are derived from the mechanization started by the industrial revolution. Internally software might be abstract, but externally the computers on which it runs are subject to real-world issues like electricity, temperature and moisture. In this way software manages to bridge the gap between our abstract thoughts and the real world around us. Software also often interfaces directly with users who are creating new input or trying to analyse what is already known.

Given the relationship, software is a form of applied mathematics. Its formal systems share all of the abstract qualities of mathematical formal systems, and the underlying data is essentially a collection of mathematical objects. Building up software on a computer is similar to building up theorems or algorithms within a branch of mathematics. Both mathematics and computers have issues when mapped back to reality, since they are at best ‘approximations’ to the world around them.

Software development has been underway for at least five decades, and as such has built up a large base of existing code, knowledge and libraries. The many partial sub-systems of software, like operating systems or domain-specific languages, can help to hide its underlying mathematical nature, particularly when crafting graphical user interfaces, but ultimately a strong understanding of mathematics helps considerably in understanding and structuring new parts for systems. There are some aspects of programming that are non-mathematical, such as styling an interface or the content of an image, but for the rest of software development a good sense of mathematics and logic aids in being able to craft elegant, correct and consistent instructions.

Since software is an applied branch of mathematics, it is clear that not all code is intrinsically mathematics, but realistically what isn’t lies only within the intersection between the machine and the user. That is, whatever irrational or illogical adaptations are needed to make the system more usable for people are non-mathematical. The rest of the system, however, if well-written (and possibly elegant), is a formal system like the many that are known in other branches of mathematics.

Wednesday, July 30, 2014

Software Development

There are five basic stages of software development:
  • Analysis
  • Design
  • Coding
  • Testing
  • Deployment
These stages are the same whether the system is brand new or just undergoing the next round of development.

Analysis is all about gathering the basic facts around the problem and any other related information necessary for the solution. The keys to getting it right are to gather precise details and to organize them for later use. This not only includes the functionality, but also any data involved, environmental restrictions and any other systems to integrate with. This should also include some understanding of why the users need their functionality and how they are doing it currently. There are many ways to organize this type of analysis, such as requirements or user stories. The analysis should be focused not only on the users' needs, but also on the operational and developmental issues as well. Many projects fail due to missing or sloppy analysis.

Design is the act of taking all of the analysis, along with any existing functionality, and structuring this information in a consistent manner. The higher 'meta-levels' of the design form the architecture. A design can be expressed as diagrams, tables or descriptive text that lays out the various pieces that need to be created. The depth of the design should be contingent on the experience and abilities of the available programmers. With more experienced coders, the design does not need to descend right down to the low-level details. Design is the creative part of software development, but it is best to prototype any questionable design elements first before relying on them. Also, designs for rather standard systems need only be rather standard themselves, and knowledge or research really helps with being able to constrain the ideas to something feasible. The key to a good design is to tightly organize the work, while only spelling out the minimum of detail. The development environment has a major effect on the design and architecture, so it needs to be factored into the output. Extensions to the system should utilize the existing code base, rather than just add to technical debt by slapping new features onto the side. A lack of design always results in a lack of an architecture, which enables disorganization that eats up unexpected but necessary resources.

Coding should be the least exciting step in software development. There are lots of little problems to solve, but the work should be dividable and should all come together at the end because of the architecture and any earlier prototypes. Unfortunately, programmers frequently rush into coding too soon, so they get delayed by forgotten analysis and design issues, often called scope creep. Coding can also be the longest step in development, which is why it is so important to leverage 'reuse' effectively to save time wherever possible. Coding sometimes gets caught in a rut, where fixing one problem causes others. Again, this is a primary indicator that the architecture was poorly thought out or does not correctly match the development environment. There is some testing that occurs in the coding stage, specifically unit testing, but it is never a substitute for a full systems test. Starting the coding stage by doing large, non-destructive refactoring is an effective way of controlling technical debt and getting the development set on a stable foundation. A haphazard coding environment leads to fragile code which ends up wasting time in the next two stages.
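
As a small illustration of the unit testing that belongs in the coding stage, here is a hedged sketch using Python's built-in unittest module; the function under test is invented for the example, and passing tests like these still say nothing about how the assembled system behaves.

```python
import unittest

def parse_version(text):
    """Split a dotted version string like '2.10.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

class ParseVersionTest(unittest.TestCase):
    def test_simple_version(self):
        self.assertEqual(parse_version("2.10.3"), (2, 10, 3))

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(parse_version(" 1.0 \n"), (1, 0))

    def test_garbage_raises(self):
        # Pins down local behaviour only; a full systems test is still needed.
        with self.assertRaises(ValueError):
            parse_version("not-a-version")

if __name__ == "__main__":
    unittest.main()
```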

Testing is at best probabilistic, in that you can't fully test what is usually an infinite number of possible inputs. It's also the stage that is most frequently reduced due to delays in coding. As such, it is important both to focus the testing on finding obvious or embarrassing bugs and to allow for patches to be released in the deployment stage. Before testing has occurred, the source code should be frozen so that no accidental untested code leaks out. Most releases should be on the main trunk of development, leaving branches for bug fixes or longer-term development. Automated testing is best, but for things like GUIs, manual testing may be more cost effective. A tremendous help is to have an established architecture that allows any changes to be properly scoped so that the tests can be focused only on the parts of the system that have changed recently. Frequent full regression tests should be performed when in doubt, or just occasionally to maintain quality.

Deployment is the most forgotten part of software development, so frequently the installation or upgrade instructions are tediously long and over-complicated due to not having enough time available to properly automate them. For commercial systems it is normal to have many different versions all out in active use at the same time, so proper/unique version numbering is crucial to minimizing support. If the design and coding went well, support issues should be minimal, but again there are frequently ignored requirements, so many systems are very fragile to unexpected external events. As well as releases and upgrades, deployment always needs at least one super-fast way of issuing critical patches to fix unexpected bugs.

If the five stages are done as large sequential tasks, the process is considered to be waterfall. This works well for large, well-defined projects since it maximizes the efficiencies, but it does not support radically shifting specifications or projects that have skipped analysis and design. If the stages are done as a series of short iterations, that allows maneuverability, but it is somewhat less efficient. A well-running development shop can overlap a number of different iterations in different stages at the same time. That is, one group is doing analysis for a new feature, the next is in design, another is coding up the previous design work, while another is testing the next release. The stages overlap nicely, but it takes some experience to keep the whole pipeline running smoothly. Projects should not be high-stress, 'hurry up and wait' affairs but rather should just be progressing at a smooth, stable, reasonable pace. Panic development begets bad choices, which decrease intelligence and efficiency. The total length of an iteration is dependent on the work being done. Larger changes need more time to get completed. Scheduling should take this and seasonal issues into account for any parallel iterations.

Most development is routine, and if accepted as such the process of building and extending large systems is enjoyable. High stress environments feed opportunities for short cuts which feed back into the vicious cycle causing poor quality and burnout. Quality is all about how well people attend to all of the little details, so silver bullets and shortcuts come at the cost of heavily reducing the quality of a system. Getting good programmers to work on bad systems is very difficult, so any initial quality problems usually continue throughout the lifetime of the project and shorten its lifespan.

Any process employed to help development should be one that actually helps the developers rather than hinders them. It is easy for management consultants to draft up a plausible-sounding process, but if the focus is not on the actual deliverables for each of the five steps, then it will most likely result in extra make-work and take away focus from the main objectives. Since each environment is essentially unique, any process should be built from the existing senior development knowledge already there, so that it is available to help the work rather than hinder it. Also, any sort of deliverables, particularly documentation, should be finely scoped to their usage. That is, business documentation should be written to be read by business people and technical documentation should be written to be read by techies. Working backwards from a definitive goal for the deliverables is an effective way to see what has value and what is just make-work.

Poor quality, unfinished work, fragility and other problems are all direct signs that the overall development process has serious problems. Adding arbitrary rules or extra oversight just makes the problems worse. Prolonged bad process kills morale and is very tough to reverse. A broken environment makes the work considerably more expensive and of lower quality. Setting up good engineering practices early makes a huge difference, but also watching for signs of a bad process and correcting it quickly can really help.

Friday, June 27, 2014

Technology

I'll start by proposing a significantly wider definition for the word 'technology'. 

To me it is absolutely 'any' and 'all' things that we use to manipulate our surrounding environment. Under this rather broad definition it would include such age old technologies as fire, clothes and shelter. 

I like this definition because it helps lay out a long trajectory for how technologies have shaped our world, and since many of our technologies are so firmly established -- like fire or clothing -- it really frames our perspective on their eventual impact.

My view is that technologies are neither good nor bad, they just are. It's what we choose to do with them that matters. 

Fire, for instance, is great when it is contained; we can use it for light, warmth or cooking food. It is dangerous when it is burning down houses, forests or whatever else is in its path. We've long since developed a respect for it,  we have an understanding of its dangers, and so we react reasonably whenever its destructive side emerges. 

You wouldn't try to ban fire, or declare that it is bad for our societies. People don't protest against it, and to my knowledge pretty much every living human being utilizes it in some way. It has been around long enough that we no longer react to it directly, but rather to the circumstances in which it appears.

This holds true for any technology, whether it be fire, clothing, machines, radio or computers. 

Upon emergence, people take strange positions on the 'goodness' or 'badness' of the new technology, but as time progresses most integrate it ubiquitously into our lives. Specific usages of the technology might still be up for debate, but the technology itself becomes mainstream. 

Still, new technologies have a significant impact.

Marshall McLuhan seemed to take a real dislike to TV, particularly as it displaced the dominance of radio. His famous tag line 'the medium is the message' was once explained to me as capturing how the creation of any new technology inevitably transforms us. 

That certainly rings true for technologies like lightbulbs, radio and TVs. Their initial existence broadened our abilities. 

Lightbulbs made us free from the tyranny of daylight. Radios personalized information dissemination well beyond the limits of newsprint and pamphlets. And TVs dumbed it all down for the masses into a form of endless entertainment. 

Each came with great advantages, but also with significant dark sides. By now we've absorbed much of the impact of both sides, such that there are fewer and fewer adverse reactions. Some people choose to live without any of these technologies -- some still don't have access -- but they are few.

Technology acquisition seems to have been fairly slow until the early 19th Century, when the industrial revolution spawned a nearly endless series of clever time-saving machines. 

We amplified these wonders in the 20th Century to create mass-production factories and then added what seems like a huge new range of additional technologies: computers, networks and cell phones. 

As these new inventions have swept through our societies, they too have shown their good and bad sides. 

The Internet as a technology went where previous information communications technology could never go, but it also has its own massive dark underbelly. A place where danger lurks. Computers have overturned many of the tedious jobs created in the industrial revolution, but replaced their physical aspects with intellectual ones. Cell phones broke the chains on where we could access computers, but chained the users back to an almost mindless subservience to their constant neediness. 

None of these things are bad, but then neither are they good. They are just part of our slow assimilation of technologies over the ages. 

To many it may seem like we are in a combinatorial explosion of new technologies, but really I don't think that is the case. Well, not directly. 

Somewhere I remember reading that it takes about twenty years for a technology to go from idea to adoption. That jibes with what I've seen so far, and it also makes sense in that that period is roughly a 'generation'. 

One generation pushes the existing limits, but it takes a whole new one to really embrace something new. Collectively, we are slow to change.

If this adoption premise is valid, then the pace for inventions is basically independent from our current level of progress. It remains constant.

What I think has changed, particularly since the start of the industrial revolution, is the sheer number of people available to pursue new inventions. 

Changes to our ability to create machines enhanced our ability to produce more food, which in turn swelled our populations. Given that weapons like nukes dampened the nature of conflicts around the globe, we are experiencing the largest population our species has had at any time in history (that we are aware of). 

Technology spawned this growth and as a result it freed up a larger segment of the population to pursue the quest for new technologies. It's a cycle, but likely not a sustainable one.

It's not that -- as I imagined when I was younger -- we are approaching the far reaches of understandable knowledge. We are far from that. We don't know nearly as much as we think we do and that extends right down to the core of what we know. 

Our current scientific approach helps refine what we learn, but we built it on rather shaky foundations. There is obviously a great deal of stuff to learn in practically every discipline out there, and there is just a tonne of stuff that we kinda know that needs to be cleaned up and simplified. 

Healthcare, software, economics, weather, management; these are all things that we do optimistically, but the results are not nearly as predictable as we would like, or people claim. On those fronts our current suite of technologies certainly has a huge distance left to go. 

Each new little rung of better predictability -- better quality -- represents at least an exponential explosion of work and knowledge acquisition. For any technology, it takes a massively long time to stabilize and really integrate it into our civilizations. 

Controlling fire was exotic at one point, but now it is no longer so magical. Gradually we collectively absorbed the ability to get reliable usage from it and lessened its negative side, or, as in the case of firemen, at least we built up a better understanding of how to deal with any problems rapidly.

For each new technology, such as software, it is a long road for us to travel before we achieve mastery. It will take generations of learning, experience and practice, before these technologies will simply become lost in the surroundings. They'll no longer be new, but we'll find better ways to leverage them for good, while minimizing the bad. This is the standard trajectory for all technologies dating right back to the first one -- which was probably just a stick used to poke stuff.

With this broader definition of technologies, because it extends so far back, it is somewhat easier to project forwards. 

If we have been gradually acquiring new technologies to allow us to manipulate our environment, it is likely that we have been chasing low hanging fruit. That is, we have been inventing technologies and integrating them roughly in the order that they were needed. 

Shelter might have been first, followed by fire then perhaps clothing. Maybe not, but it would not be unreasonable to assume that people tended to put their energies into their most significant problems at the moment; we do not generally have really good long-term vision, particularly for things that go beyond our own lifetimes. 

With that in mind, whether or not you believe in global warming, it has become rather obvious that our planet is not the nice, consistent, stable environment that we used to dream that it was. It's rather volatile and possibly easily influenced by the life forms traipsing all over it. 

That of course suggests that the next major technological trend is probably going to be related to our controlling the planet's environment, in the same way that clothing and shelter helped us deal with the fickle weather. 

To continue our progress, we'll need to make sure that the continual ice ages and heat waves don't throw us drastically off course. Any ability we gain that can help there is a technology by my earlier definition. 

As well, the space available on our planet is finite. 

Navigating outer space seemed easy in the science fiction world of last century, but in practice it does appear to be well beyond our current technological sophistication. 

We don't even have a clue how to create the base technologies like warp drives or anti-gravity, let alone keep a huge whack of complicated stuff like a space shuttle running reliably. 

We're talking a lot about space exploration but our current progress is more akin to our ancestors shoving out logs into the ocean to see if they float. It's a long way from there to their later mastery of crafting sailing ships and another massive leap to our state-of-the-art cruise liners. Between all of those is obviously a huge gulf, and one that we need to fill with many new technologies, great and small.

Given our short life spans, we have a tendency to put on blinders and look at the progress of the world across just a few decades. That incredibly tiny time horizon doesn't really do a fair job in laying out the importance, or lack of importance, in what is happening with us as a species. 

We're on a long-term trajectory to somewhere unknown, but we certainly have been acquiring lots of sporadic knowledge about where we have come from. 

Of course it will take generations to piece it all together and further generations to consolidate it into something rational, but we in our time period at least get to see the essence of where we have been and where we need to go. 

Our vehicles for getting there are the technologies that we have been acquiring over millennia. They are far from complete, far from well understood, but we should have faith that they form the core of our intellectual progress. 

They map out the many paths we have been taking. 

Technology is the manifestation of us applying our intellect, which is the current course set by evolution. It tried big and powerful, but failing that it is now trying 'dynamic': an ability to adapt to one's surroundings much faster than gradual mutations ever could. 

Thursday, June 19, 2014

Recycling

I was chatting with a friend the other day. We're both babysitting large systems (>350,000 lines) that have been developed by many, many programmers over years. Large, disorganized mobs of programmers tend towards creating rather sporadic messes, with each new contributor going further off in their own unique direction as the chaos ensues. As such, debugging even simple problems in that sort of wreckage is not unlike trying to make sense of a novel where everyone wrote their own paragraphs, in their own unique voice, with different tenses and with different character names, and now all the paragraphs are hopelessly intertwined into what is supposedly a story. Basically, flipping between the many different approaches is intensely headache inducing even if they are somehow related.

Way back in late 2008, I did some writing about the idea of normalizing code in the same way that we normalize relational database schemas:


There are a few other discussions and papers out there, but the idea was never popular. That's strange given that being able to normalize a decrepit code base would be a huge boon to people like my friend and me who have found ourselves stuck with somebody else's shortsightedness.

What we could really use to make our lives better is a way to feed the whole source clump into an engine that will non-destructively clean it up in a consistent manner. It doesn't really matter how long it takes, it could be a week or even a month, just so long as in the end the behaviour hasn't gotten worse and the code is now well-organized. It doesn't even matter anymore how much disk space it uses, just that it gets refactored nicely. Not just the style and formatting, but also the structure and perhaps the naming as well. If one could create a high-level intermediate representation around the statements, branches and loops, in the same way that symbolic algebra calculators like Maple and Mathematica manipulate mathematics, then it would just be straightforward processing to push and pull the lines matching any normalizing or simplification rule. 
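
As a rough sketch of the idea, assuming Python 3.9+ and using its ast module as a stand-in for that intermediate representation, a single trivial rule (folding constant arithmetic) plus unparsing already yields consistent formatting; a real normalizer would carry a much larger rule set.

```python
import ast
import operator

source = """
def total(a,b , c):
    x=a+ b
    return x+  2*3 + c
"""

# One trivial 'normalization' rule: fold constant arithmetic like 2*3.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class FoldConstants(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (type(node.op) in FOLDABLE
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            value = FOLDABLE[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

tree = FoldConstants().visit(ast.parse(source))

# Unparsing the tree also regenerates consistent formatting for free.
print(ast.unparse(ast.fix_missing_locations(tree)))
# def total(a, b, c):
#     x = a + b
#     return x + 6 + c
```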

Picking between the many names for variables holding the same type or instance of data would require stopping for human intervention, but that interactive phase would be far less time-consuming than refactoring by hand or even with the current tool set that is available in most IDEs. And a reasonable structural representation would allow identifying not only duplicate code, but also code that was structurally similar yet contained a few different hard-coded parameters. That second case opens the door to automated generalization, which, given most code out there, would be a huge boost in drastically reducing the code size. 
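
A hedged sketch of how that structural matching might work: fingerprint each function by its tree shape with names and constants blanked out, so bodies that differ only in hard-coded parameters compare as equal. The two functions below are invented for the example.

```python
import ast

def structural_fingerprint(func_source):
    """Hash a function's shape, ignoring names and hard-coded constants."""
    parts = []
    for node in ast.walk(ast.parse(func_source)):
        if isinstance(node, ast.Constant):
            parts.append("CONST")      # blank out hard-coded parameters
        elif isinstance(node, ast.Name):
            parts.append("NAME")       # blank out variable names
        else:
            parts.append(type(node).__name__)
    return hash(tuple(parts))

# Two invented functions that differ only in names and constants.
a = "def retry_db(n):\n    return n * 3 + 1\n"
b = "def retry_net(count):\n    return count * 5 + 2\n"

# A matching fingerprint marks a candidate for automated generalization.
print(structural_fingerprint(a) == structural_fingerprint(b))   # True
```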

One could even apply meta-compiler type ideas to use the whole infrastructure to convert easily between languages. The code-to-representation part could be split away from the representation-to-code part. That second half could be supplied with any number of processing rules and modern style guides so that most programmers who follow well-known styles could easily work on the revised ancient code base. 

Of course another benefit is that once the code was cleaned up, many bugs would become obvious. Non-symmetric resource handling is a good example: if the code grabbed a resource but never released it, that might have previously been buried in spaghetti, but once normalized it would be a glaring flaw. Threading problems would also be brought quickly to the surface.
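
A toy version of that kind of check, sketched in Python over the parse tree rather than a full intermediate representation; the acquire/release function names are invented for the example.

```python
import ast

# Invented pairing of acquire/release calls for the example.
PAIRS = {"open_connection": "close_connection", "lock": "unlock"}

def unbalanced_resources(source):
    """Report functions that acquire a resource more often than they release it."""
    problems = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        calls = [n.func.id for n in ast.walk(func)
                 if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)]
        for acquire, release in PAIRS.items():
            if calls.count(acquire) > calls.count(release):
                problems.append((func.name, acquire))
    return problems

source = """
def fetch(job):
    conn = open_connection(job)
    data = conn.read()
    return data              # the connection is never closed
"""
print(unbalanced_resources(source))   # [('fetch', 'open_connection')]
```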

This of course leads to the idea of code recycling. Why belt out new bug-riddled code, when this type of technology would allow us to reclaim past efforts without the drudgery of having to unravel their mysteries? 

A smart normalizer might even leverage the structural understanding to effectively apply higher-level concepts like design patterns. That's possible in that functions, methods, objects, etc. are in their essence just ways to slice and dice the endless series of instructions that we need to supply. With structure, we can shift the likely DAG-based representations around, changing where and how we insert those meta-level markers. We could even extract large computations buried within global variables into self-standing stateless engines. Just that capability alone would turbocharge many large projects.

With enough computing time -- and we have that -- we could even compute all of the 'data paths' through the code that would show how basically the same underlying data is broken apart, copied and recombined many times over, which is always an easy and early target when trying to optimize code. Once the intermediate representation is known and understood, the possibilities are endless.
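
A minimal sketch of what extracting those 'data paths' could look like, again leaning on Python's ast module; real code would need to handle augmented assignments, attributes, containers and much more.

```python
import ast
from collections import defaultdict

def data_paths(source):
    """Map each assigned variable to the variables its value was derived from."""
    derived_from = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Ignore the names of called functions; keep only data sources.
        called = {id(c.func) for c in ast.walk(node.value) if isinstance(c, ast.Call)}
        sources = {n.id for n in ast.walk(node.value)
                   if isinstance(n, ast.Name) and id(n) not in called}
        for target in node.targets:
            if isinstance(target, ast.Name):
                derived_from[target.id] |= sources
    return dict(derived_from)

# Invented example: the same underlying data gets split, cleaned and recombined.
source = """
raw = load()
parts = split(raw)
cleaned = strip(parts)
merged = join(cleaned, raw)
"""
print(data_paths(source))
# {'raw': set(), 'parts': {'raw'}, 'cleaned': {'parts'}, 'merged': {'cleaned', 'raw'}}
# (element order within each set may vary)
```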

There are at least trillions of lines of code out there, much of which has been decently vetted for usability. That stuff rusts constantly, and our industry has shied away from learning to really utilize it. Instead, year after year, decade after decade, each new wave of programmers happily rewrites what has already been done hundreds of times before. Sure, it's easier, and it's fun to solve the simple problems, but we're really not making any real progress by working this way. Computers are stupid, but they are patient and quite powerful, so it seems rather shortsighted for us not to be trying to leverage this for our own development work in the same way that we try to do it for our users. Code bases don't have to be stupid and ugly anymore; we have the resources now to change that, all we need to do is put together what we know into a working set of tools. It's probably about time that we stopped solving CS 101 problems and moved on to the more interesting stuff.

Sunday, June 1, 2014

Requirements

The idea of specifying programs by defining a set of requirements goes way way back. I saw a reference from 1979, but it is probably a lot earlier. Requirements are the output from the analysis of a problem. They outline the boundaries of the solution. I've seen many different variations on them, from very formal to quite relaxed.

Most people focus on direct stakeholder requirements; those from the users, their management and the people paying the bill for the project. These may frame the usage of the system appropriately, but if taken as a full specification they can lead to rampant technological debt. The reason for this is that there are also implicit operational and development requirements that, although unsaid, are necessary to maintain a stable and usable system. You can take shortcuts to avoid the work initially, but it always comes back to haunt the project.

For this post I'll list out some of the general requirements that I know, and the importance of them. These don't really change from project to project, or from domain to domain. I'll write them in rather absolute language, but some of these requirements are what I would call the gold or platinum versions, that is, they are above the least-acceptable bar.

Operational Requirements

Software must be easily installable. If there is an existing version, then it must be upgradeable in a way that retains the existing data and configuration. The installation or upgrade should consist of a reasonably small number of very simple steps, and any information that the install needs that can be obtained from the environment should automatically be filled in. In-house systems might choose not to do a fresh install, but if they do, then they need another mechanism for setting up and synchronizing test versions. The installation should be repeatable and any upgrade should have a way to roll back to the earlier version. Most installs should support having multiple instances/versions on the same machine, since that helps with deployments, demos, etc..
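
As a hedged sketch of what 'repeatable with a rollback' can amount to in the simplest case, here is an upgrade step that snapshots the current install before touching it; the directory layout and version file are invented for the example.

```python
import shutil
from pathlib import Path

# Invented layout for the example: the application lives in one directory
# and records its own version in a small text file.
APP_DIR = Path("/opt/example-app")
BACKUP_DIR = Path("/opt/example-app.previous")

def upgrade(new_files: Path, new_version: str) -> None:
    """Snapshot the current install, then lay the new files down on top of it."""
    if BACKUP_DIR.exists():
        shutil.rmtree(BACKUP_DIR)
    shutil.copytree(APP_DIR, BACKUP_DIR)            # cheap insurance for rollback
    try:
        shutil.copytree(new_files, APP_DIR, dirs_exist_ok=True)   # Python 3.8+
        (APP_DIR / "VERSION").write_text(new_version + "\n")
    except Exception:
        rollback()
        raise

def rollback() -> None:
    """Throw away the failed upgrade and restore the snapshot."""
    shutil.rmtree(APP_DIR)
    shutil.copytree(BACKUP_DIR, APP_DIR)
```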

Having the ability to easily install or upgrade provides operations and the developers with the ability to deal with problems and testing issues. If there is a huge amount of tech debt standing in the way, the limits force people into further destructive short-cuts. That is, they don't set things up properly, so they get caught by surprise when the testing fails to catch what should have been an obvious problem. Since full installations occur so infrequently, people feel this is a great place to save time.

Software should handle all real world behaviours, it should not assume a perfect world. Internally it should expect and handle every type of error that can be generated from any sub-components or shared resources. If it is possible for the error to be generated, then there should be some consideration on how to handle that error properly. Across the system, error handling should be consistent, that is if one part of the system handles the error in a specific way, all parts should handle it in the same way. If there is a problem that affects the users, it should be possible for them to work around the issue, so if some specific functionality is unavailable, the whole system shouldn't be down as well. For any and all shared resources, the system should use the minimal amount and that usage should be understood and monitored/profiled before the software is considered releasable. That includes both memory and CPU usage, but it also exists for resources like databases, network communications and file systems. Growth should be understood and it should be in line with reasonable operational growth. 
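
One way to keep error handling consistent, and to keep a single failing feature from taking the whole system down, is to route risky calls through one shared wrapper. This is only a sketch; the logger name and the recommendation-service example are invented.

```python
import functools
import logging

log = logging.getLogger("example-system")

def degrades_gracefully(fallback=None):
    """Route failures through one consistent path instead of ad hoc handling.

    The feature that failed returns its fallback value; the rest of the
    system keeps running.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                log.exception("%s is unavailable, continuing without it", func.__name__)
                return fallback
        return wrapper
    return decorator

# Invented example: if the recommendation service is down, show an empty list
# rather than taking the whole page down with it.
@degrades_gracefully(fallback=[])
def fetch_recommendations(user_id):
    raise ConnectionError("recommendation service unreachable")

print(fetch_recommendations(42))   # [] -- the failure is logged, not fatal
```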

Many systems out there work fine when everything is perfect, but are downright ornery when there are any bumps, even if they are minor. Getting the code to work on a good day is only a small part of writing a system. Getting it to fail gracefully is much harder, but it is often skipped, which is an invalid shortcut.

When a problem with software does occur, there should be sufficient information generated to properly diagnose the problem and zero in on a small part of the code base. If the problem is ongoing, there should not be too much information generated; it should not eat up the file system or hurt the network. It should just provide a reasonable amount of information initially and then advise on the ongoing state of the problem at reasonable intervals. Once the problem has been corrected it should be automatic, or at least easy, to get back to normal functionality. It should not require any significant effort.

Quirky systems often log excessively, which defeats the purpose of having a log since no one can monitor it properly. Logs are just another interface for the operations personnel and programmers, so they should be treated nicely. Some systems require extensive fiddling after an error to reset poorly written internal states. None of this is necessary, and it just adds an extra set of problems after the first one. It is important not to over-complicate the operational aspects by not addressing their existence or frequency.
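
A minimal sketch of the 'report once, then summarize at reasonable intervals' behaviour; the interval and message are invented for the example.

```python
import logging
import time

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("example-system")

class ThrottledReporter:
    """Log a recurring problem once, then only summarize it at a fixed interval."""

    def __init__(self, interval_seconds=300):
        self.interval = interval_seconds
        self.last_report = float("-inf")
        self.suppressed = 0

    def report(self, message):
        now = time.monotonic()
        if now - self.last_report >= self.interval:
            if self.suppressed:
                log.error("%s (repeated %d times since last report)",
                          message, self.suppressed)
            else:
                log.error("%s", message)
            self.last_report = now
            self.suppressed = 0
        else:
            self.suppressed += 1    # keep the log readable instead of flooding it

# Usage sketch: inside a retry loop that might otherwise fail thousands of times.
reporter = ThrottledReporter(interval_seconds=300)
for _ in range(10000):
    reporter.report("database connection refused")   # logged once, not 10000 times
```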

If there is some major, unexpected problem then there should be a defined way of getting back to full functionality. A complete reboot of the machine should always work properly; there may also be faster, less severe options such as just restarting specific processes. Any of these corrective actions should be simple, well-tested and trustworthy, since they may be chosen in less than ideal circumstances. They should not make the problems worse, even if they do not fix them outright. As well, it should be possible to easily change any configuration and then do a full reboot to ensure that that configuration is utilized. There may be less severe options, but again there should always be one big, single, easy route to getting everything back to a working state.

It is amazing how fiddly and difficult many of the systems out there are right now. Either correcting them wasn't considered or the approach was not focused. In the middle of a problem, there should always be a reliable hail mary pass that is tested and ready to be employed. If it is done early and tested occasionally, it is always there to provide operational confidence. Nothing is worse than a major problem being unintentionally followed by a long series of minor ones.

Development Requirements

Source code control systems have matured to the point where they are mandatory for any professional software development. They can be centralized or distributed, but they all provide strong tracking and organizational features such that they can be used to diagnose programming or procedural problems at very low cost. All independent components of a software system must also have a unique version number for every instance that has been released. The number should be easily identifiable at runtime and should be included with any and all diagnostic information.
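
A tiny sketch of making the version easily identifiable at runtime: stamp a version constant into the build (here hand-written; in practice it is usually generated from the repository tag) and include it in every diagnostic line.

```python
import logging

# In practice this constant is usually generated by the build from the VCS tag;
# the value here is invented for the example.
VERSION = "2.7.1+build.4312"

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s example-system " + VERSION + " %(levelname)s %(message)s",
)
log = logging.getLogger("example-system")

log.info("starting up")                       # every log line carries the release
print(f"example-system, version {VERSION}")   # also handy behind a --version flag
```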

When things are going well, source repos don't take much extra effort to use properly. When things are broken they are invaluable at pinpointing the source of the problems. They also work as implicit documentation and can help with understanding historic decisions. It would be crazy to build anything non-trivial without using one.

A software system needs to be organized in a manner that encapsulates its sub-parts into pieces that can be used to control the scope for testing and changes. All of the related code, configurations and static data must be placed together in a specific location that is consistent with the organization of the rest of the system. Changes to the system are scoped and the minimal number of tests is completed to verify their correctness. Consistency is required to allow programmers the ability to infer system-wide behavior from specific sub-sections of code.

The constant rate of change for hardware and software is sufficiently fast that any existing system that is no longer in active development starts to 'rust'. That is, after some number of years it has slipped so far behind its underlying technologies that it becomes nearly impossible to upgrade. As such, any system in active operation also needs to maintain some development effort as well. It doesn't have to be major extensions, but it is always necessary to keep moving forward on the versions. Because of this it is important that a system be well-organized so that, at the very least, any changes to a sub-part of the system can be completed with the confidence that they won't affect the whole. This effectively encapsulates the related complexities away from the rest of the code. This allows any change to be correctly scoped so that the whole system does not need a full regression test. In this manner, it minimizes any ongoing work to a sub-part of the system. The core of a software architecture is this necessary system-wide organization. Beyond just rusting, most systems get built under severe enough time constraints that it takes a very long time and a large number of releases before the full breadth of functionality has been implemented. This means that there are ongoing efforts to extend the functionality. Having a solid architecture reduces the amount of work required to extend the system and provides the means to limit the amount of testing necessary to validate the changes.

Finally

There are lots more implicit non-user requirements, but these are the ones that I commonly see violated on a regular basis. With continuously decreasing time expectations, it is understandable why so many people are looking for shortcuts, but these never come for free, so there are always consequences that appear later. If these implicit requirements are correctly accounted for in the specifications of the system, the technical debt is contained so that the operations and ongoing development of the system are minimized. If these requirements are ignored, ever-increasing hurdles get introduced which compromise the ability to correctly manage the system and make it significantly harder to correct later.