Sunday, February 15, 2015

Static vs. Dynamic

Possibly the most significant movement in programming has been to avoid 'hardcoding' values. Sizes, limits and text should not be directly encoded into the code, since doing so would require the system to be rebuilt and redeployed each time they needed to change. That causes long delays in adapting to unexpected changes.

A better approach is to move these values into a configuration file, so that they can be easily changed if the need arises. This allows the static values to be managed independently of the code, but it can still be somewhat painful on occasion because it requires manual intervention in an operational environment.

The best approach is to make the program dynamic, so that it no longer needs the values and can adapt to any changes. A simple example is user preferences. It's easy to lock a fixed number into the code so each user can change no more than, say, 10 settings, but 10 is a completely arbitrary number. Computers really like fixed, finite data, but users don't. It really isn't all that complicated to allow any user to have as many custom preferences as they desire. Internally, one can shift from a fixed-size array to a linked list. Most persistence solutions don't have fixed limitations, so there aren't real problems there. Searching could become an issue, but applying proper data structures like a hash table can keep the performance in line with growth.
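
As a minimal sketch of this shift, here is what it might look like in Python (all names hypothetical): a hash table replaces the fixed-size array, so there is no arbitrary cap on the number of settings and lookups stay fast as the data grows.

```python
# Hypothetical preferences store: a dict (hash table) imposes no
# arbitrary limit on how many settings a user may have, and lookup
# performance keeps pace with growth.
class UserPreferences:
    def __init__(self):
        self._prefs = {}  # grows dynamically; no hardcoded cap of 10

    def set(self, name, value):
        self._prefs[name] = value

    def get(self, name, default=None):
        return self._prefs.get(name, default)

prefs = UserPreferences()
prefs.set("theme", "dark")
prefs.set("font_size", 14)
# ...and as many more as the user desires
print(prefs.get("theme"))  # dark
```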

The fact that the size of the data is now variable may make resource allocation slightly more complex, but realistically, if we understand why users have individual preferences, then we can estimate the average number they need and, along with the expected number of users, get a reasonable guess at the overall size. For example, 100,000 users averaging 20 preferences of roughly 100 bytes each is only about 200 MB. Of course, these days, with disk space so cheap and such tiny data, we no longer need to spend much time thinking about this type of usage.

Driving things dynamically can be taken to a level far more significant than just avoiding hardcoded values. Dynamic behaviour in code is really internally driven variability. That is, any of the underlying attributes -- both data and code -- can vary based on the current state of the program. Compilers are a good example of this. They can take any large set of instructions in a programming language and use that data to create equivalent machine code that will execute properly on a computer. They can even examine the code macroscopically and 'optimize' parts of it into similar code that executes faster. Internally, they dynamically build up this code based on their input, often creating some form of intermediate representation first, before constructing the final output. Being able to do that gave us the ability to construct programs in languages that were far more convenient than machine code, thus saving a stunning amount of time and allowing us to build bigger, more complex systems.
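
To make the compiler analogy concrete, here is a toy sketch (in Python, nothing like a real compiler) that takes a tiny arithmetic expression, uses Python's own AST as the intermediate representation, and emits instructions for a hypothetical stack machine, all driven entirely by its input:

```python
import ast

def to_stack_code(expr):
    """Translate a tiny arithmetic expression into instructions for a
    hypothetical stack machine, using the AST as the intermediate
    representation."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
    code = []

    def emit(node):
        if isinstance(node, ast.BinOp):
            emit(node.left)
            emit(node.right)
            code.append(ops[type(node.op)])
        elif isinstance(node, ast.Constant):
            code.append(f"PUSH {node.value}")
        else:
            raise ValueError("unsupported construct")

    emit(ast.parse(expr, mode="eval").body)
    return code

print(to_stack_code("1 + 2 * 3"))
# ['PUSH 1', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
```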

Being able to build up complex logic on the fly is a neat trick, but there are still more interesting ways to make code dynamic. In the case of compilers, besides dynamically creating a structure, the code itself doesn't know what the program it is creating is actually going to do when it runs. This 'hands off' approach to managing the data is an important side-effect of dynamic behavior. As we generalize, the specifics start to fade. The code understands less of what it is manipulating, but making this tradeoff allows it to be more broadly usable. It wouldn't make sense to write a special compiler that can only compile one specific program. That's too much work for too little value.

We can take this need-to-know approach and apply it to other programs. For instance, if we have a client/server architecture, there might be hundreds of unique types of data structures moving back and forth between the different tiers. Explicitly coding each message type is a lot of work; it would be far better to find a dynamic and cost-effective way of moving any data about. For this we could utilize a data format like JSON to encode the data into 'containers'. Doing so would allow the communications code to have almost no knowledge of what's inside the container, cutting down on a huge amount of code. Instead of one chunk of code for each structure, we just have a single chunk that handles all of the structures.
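
As a sketch (in Python, with hypothetical message names), the transport below can move any structure at all; only the endpoints that register handlers know what is inside the containers:

```python
import json

# Generic transport: one chunk of code encodes and routes every
# message type without knowing what any container holds.
def send(message_type, payload):
    return json.dumps({"type": message_type, "payload": payload})

handlers = {}  # only the endpoints know the specifics

def receive(wire_data):
    container = json.loads(wire_data)
    handlers[container["type"]](container["payload"])

# Hypothetical endpoint code registers the one bit of specific logic.
handlers["user.updated"] = lambda p: print("user", p["id"], "changed")

receive(send("user.updated", {"id": 42, "name": "Alice"}))
```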

We can go farther by using technologies like introspection to dynamically create containers out of any other data structure in the system, and we could apply a looser typing paradigm at the other end to allow parsing the container back into a generic structure. Creating a structure of explicitly declared variables is common, but we could push those compile-time variables into runtime keys attached to the same values. If the values can also include other containers recursively, then the whole dynamic structure has the full expressibility of any static data structure created manually. This not only drives the values dynamically, but their combined structures as well. The in-between code can move and manipulate the data, but may still not know what it is or why it's structured in a particular way.
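
In Python, the introspection half of this can be sketched with vars(): an object graph collapses recursively into generic containers, and the compile-time attribute names become runtime keys. This is only an illustrative fragment; a production serializer would handle far more cases.

```python
def to_container(obj):
    """Recursively convert any object into generic containers, turning
    compile-time attribute names into runtime keys."""
    if hasattr(obj, "__dict__"):
        return {key: to_container(value) for key, value in vars(obj).items()}
    if isinstance(obj, (list, tuple)):
        return [to_container(item) for item in obj]
    return obj  # a plain value: string, number, bool, None

# Hypothetical structures; to_container knows nothing about them.
class Address:
    def __init__(self, city):
        self.city = city

class Customer:
    def __init__(self, name, address):
        self.name = name
        self.address = address

print(to_container(Customer("Alice", Address("Toronto"))))
# {'name': 'Alice', 'address': {'city': 'Toronto'}}
```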

This type of dynamic behavior can really amplify the functionality of a small amount of code. At some point, but only briefly, the specifics have to come into play. Minimizing the size of that explicit code can save a huge amount of work, allow greater flexibility and actually make the system more robust.

Getting to this level of dynamic code can extend right out to the interface as well. Widgets, for example, don't really care about the data they are handling. They may have to validate it against a generic type, a domain table or another widget, but other than that it is just some data typed in by a user. If we attach the generic communications data to a dynamic arrangement of widgets, all we need to do is bind by some key, which could easily come from a key/value pair. In that way, we could throw up dynamic forms, fill them with loosely attached dynamic data, and provide some really convenient means of flagging unmatched keys. We could apply the same trick on the backend for popular persistence technologies like key/value databases. Using an ORM and some long, skinny tables, we could also persist the data into a relational database. The whole system can be dynamic end-to-end; we could even wire it so that the flows from screen to screen are driven dynamically, thus creating a fully dynamic workflow system.
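
The 'long, skinny table' half of this can be sketched with Python's built-in sqlite3: a single entity-attribute-value table persists any container at all, and the schema never changes no matter what new keys appear. The table and column names here are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One long, skinny table holds any structure; new keys never
# require a schema change.
db.execute("CREATE TABLE properties (entity TEXT, key TEXT, value TEXT)")

def persist(entity, container):
    for key, value in container.items():
        db.execute("INSERT INTO properties VALUES (?, ?, ?)",
                   (entity, key, str(value)))

def load(entity):
    rows = db.execute("SELECT key, value FROM properties WHERE entity = ?",
                      (entity,))
    return dict(rows)

persist("user:42", {"theme": "dark", "font_size": 14})
print(load("user:42"))  # {'theme': 'dark', 'font_size': '14'}
```

Note that the values come back as strings here; a fuller sketch would persist type information alongside each value so the generic structure can be rebuilt faithfully.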

I should mention the caveat that there are limits to how dynamic a system can be. If you go too far, you get back to nothing more than a general-purpose computer. Still, even at that full 'Turing machine' level we can get value by implementing a domain-specific language (DSL) with primitives and syntax tailored to allow users to easily construct logic fragments that precisely match their own quickly changing domain problems. The challenge is to create a language that the users can easily understand. That is, it speaks to them as natively as possible given the underlying formalities. If they feel comfortable reading and writing it, then they can craft fragments that bind dynamic behavior to their own specifics. That can be an immensely powerful way to empower the users without getting bogged down in statically analysing every tiny detail of their domain. You just push the problem back onto the experts by creating tools that are dynamic enough to allow them to fulfil their own needs. What's in between solves the technical problems, but all of the domain ones are driven dynamically.
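
As a toy illustration of the idea (a made-up rule language, with hypothetical primitives and actions), the interpreter below lets a domain expert write rules like "when total over 100 apply discount" without ever touching the underlying code:

```python
# A toy interpreter for a made-up, user-facing rule language.
# Rule form: "when <field> over <number> apply <action>"
def run_rules(rules, record, actions):
    for rule in rules.strip().splitlines():
        _, field, _, limit, _, action = rule.split()
        if record[field] > float(limit):
            actions[action](record)

# Hypothetical domain actions supplied by the technical side.
actions = {
    "discount": lambda r: print("discount applied to order", r["id"]),
    "review":   lambda r: print("flag order", r["id"], "for review"),
}

rules = """
when total over 100 apply discount
when total over 500 apply review
"""

run_rules(rules, {"id": 7, "total": 250.0}, actions)
# discount applied to order 7
```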

What's so powerful about dynamic behavior is that if you have well-encapsulated dynamic code, you can find all sorts of ways to reuse it. We traditionally do this in very limited circumstances with common libraries for each language. Those always have a limited understanding of the data they are manipulating, which is why they are usable by so many people. Scaling up these types of abstract coding practices to the rest of the system is the essential ingredient for creating reusable 'lego blocks'. The blocks can be slightly more static than what's in a common library, but so long as they are not completely hardcoded and some thought is given to their many usages, they can be deployed to handle dozens of problems, even ones that are not currently on the foreseeable horizon.

The slow trend in software is gradually towards making our code more dynamic. It happens regularly in technologies like languages, operating systems and databases, but it can also be applied with great success to other places like application software. Paradigms like Object Oriented design were intended to help programmers make more use of dynamic code, but often those goals were not well shared within the programming communities. Treating objects as just a means to slice and dice code makes little sense unless you see them as a way to simplify creating dynamic code. As such, bloated, static, hardcoded real-world objects really go against the grain of the original paradigm, which would prefer fully reusable, encapsulated, abstract objects as the best way to fully leverage the technology and create better software.

Dynamic code is an exceptionally powerful means of building better software, but in the rush to code this is often forgotten. We should focus harder on teaching each new generation of programmers why it is so important. Leveraging existing code is a whole lot better than rewriting it over and over again. It's a lot faster and safer too.