One of the great lessons learned -- long ago -- was that making variables
‘global’ in programming code was just asking for trouble. It is, of
course, easier to write the code with globals; you just declare
everything global then fiddle with it wherever you want. But that ease
comes at the rather horrendous cost of trying to modify the code later.
Get enough different sections of code playing with the same global and
suddenly it is very complicated to ascertain what any little
change to the variable will do to the overall system. So we came to the
conclusion that these ‘side-effects’ were both very expensive and
completely undesirable. A lot of effort went into making it possible in
most modern languages to batten down the scope of everything:
variables, functions, etc. We turned our attention towards stuffing as
much as possible into ‘black boxes’ -- encapsulation -- so that each piece
has minimal interaction with the rest of the code base.
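
To make the contrast concrete, here is a minimal Python sketch. Everything in it (`tax_percent`, `apply_holiday_promo`, `Invoice`) is a hypothetical name invented for illustration, not something from a real code base. The first half shows a global that distant code fiddles with; the second half stuffs the same value into a small black box.

```python
# Hypothetical example: a shared global that any code can modify.
tax_percent = 10


def price_with_tax(price):
    # The result silently depends on whatever tax_percent holds right now.
    return price * (100 + tax_percent) // 100


def apply_holiday_promo():
    # A distant piece of code fiddles with the global.
    global tax_percent
    tax_percent = 0


class Invoice:
    """Encapsulated alternative: the rate is set once and kept private."""

    def __init__(self, tax_percent):
        self._tax_percent = tax_percent

    def price_with_tax(self, price):
        return price * (100 + self._tax_percent) // 100


before = price_with_tax(100)
apply_holiday_promo()
after = price_with_tax(100)           # same call, different answer
invoice = Invoice(10)
stable = invoice.price_with_tax(100)  # unaffected by the promo
```

The two calls to `price_with_tax(100)` disagree because something else touched the global in between, while the `Invoice` keeps answering the same way.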
Another lesson often learned was that stateless code was considerably easier to
debug than code that supported a lot of internal states. If you execute
the code and each time its behavior is identical, it is fairly
straightforward to determine if it is correct or not. If, however, the
behavior fluctuates based on changing internal state information, then
testing becomes a long, drawn-out process of cross-referencing all of
the different inputs with the different outputs (a task usually
short-changed, leading to corner-case bugs). Test cases become complex
sequences that are both time-consuming and hard to accurately reproduce.
Simple tests for stateless code mean less work and better quality.
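
A rough Python illustration of the difference in testing effort, using made-up names (`discount_stateless`, `DiscountStateful`) and a made-up rule (every third call gets 10% off) purely as assumptions for the sketch:

```python
def discount_stateless(total, is_member):
    """Same inputs always give the same output."""
    return total * 90 // 100 if is_member else total


class DiscountStateful:
    """Output depends on how many times discount() was called before."""

    def __init__(self):
        self._calls = 0

    def discount(self, total):
        self._calls += 1
        # Hypothetical rule: every third call gets the discount.
        return total * 90 // 100 if self._calls % 3 == 0 else total


# Stateless: one call per case is a complete test.
member_price = discount_stateless(200, True)

# Stateful: the test has to replay a whole sequence to reach the case.
stateful = DiscountStateful()
sequence = [stateful.discount(200) for _ in range(3)]
```

The stateless version needs one line per case. The stateful one can only be checked by reproducing the exact history of calls that leads up to the interesting moment.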
State changes can come from internal modification of variables, but they are
most often triggered by things external to the scope of the code. Thus,
function A modifies some state information, so that the behavior of
function B changes. Generally the call to function A comes from
somewhere on the outside of function B’s code block. This essentially
forms an indirect reference to the state for function B, which relies
not on a global variable, but rather on a function that could be accessed
globally. A global function. When we banished globals we did so for
static variable declarations; however, a code-based dynamic call is
essentially the same thing. In a very real sense, any part of the
program that is subject to changes, direct or indirect, that
originate from other parts of the program is some type of global. Global
data or global action, it doesn’t matter.
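
Here is a small Python sketch of that function A / function B relationship. The names `set_verbose` and `describe` are hypothetical stand-ins; note that no caller ever touches a global variable directly.

```python
# Module-private state; callers never see or declare it themselves.
_mode = {"verbose": False}


def set_verbose(flag):
    """'Function A': reachable from anywhere, changes hidden state."""
    _mode["verbose"] = flag


def describe(item):
    """'Function B': its behavior depends on what A did earlier."""
    return f"item: {item!r}" if _mode["verbose"] else str(item)


first = describe("x")
set_verbose(True)       # could be called from any corner of the program
second = describe("x")  # same argument, different result
```

There is no global variable in sight at the call sites, yet `describe` behaves like code that reads one, because `set_verbose` is itself globally reachable.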
Ideally, to make everything easily testable, we’d like 100% of the inputs
explicitly passed into every function as arguments, and to support changes
within the system we’d like 0% side-effects, so that everything changed is
returned from the function. Global-less, stateless code.
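
As a sketch of what that ideal might look like in Python (the `Account` type and `deposit` function are assumptions made up for the example), every input arrives as an argument and the only way a change leaves the function is through the return value:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    """Hypothetical immutable record; frozen so nobody can mutate it."""
    owner: str
    balance: int


def deposit(account, amount):
    """Everything needed comes in as arguments; the change is returned."""
    return replace(account, balance=account.balance + amount)


original = Account("ann", 100)
updated = deposit(original, 50)
```

Here `original` still holds a balance of 100 after the call, and `updated` carries the new value, so a test only has to compare inputs with outputs.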
Often in APIs there is a large number of different primitives available.
Different users of the API will access subsets of these functions in
many different orders. In most Object-Oriented (OO) languages this is
handled by using something equivalent to set/get methods to alter the
state of internal private variables, which other primitives use as
values in their calculations. However, these methods are only available
if the object is within the scope of the caller, so it has the effect of
constraining their usage. So long as the object is not global, the
methods are not either. The object becomes a local variable, and
interacting with it can happen in any order necessary. However, you can
easily violate this either by making the methods static or by creating the
object as a Singleton. Either way introduces a global effect.
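
A hedged Python sketch of how that happens, with hypothetical `Settings` and `GlobalSettings` classes. The Singleton here is one common hand-rolled form, not the only way to write one:

```python
class Settings:
    """Ordinary object: only code holding a reference can change it."""

    def __init__(self):
        self.locale = "en"


class GlobalSettings:
    """Singleton-style variant: every caller gets the same instance."""

    _instance = None

    def __init__(self):
        self.locale = "en"

    @classmethod
    def instance(cls):
        # Create the shared instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


def far_away_code():
    # Needs no reference passed in; the class itself is the access path.
    GlobalSettings.instance().locale = "fr"


local = Settings()
far_away_code()
shared = GlobalSettings.instance()
```

After `far_away_code()` runs, `local.locale` is still "en" because nothing else could reach it, while `shared.locale` has become "fr" even though this code never asked for the change.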
Another way to mess with things is to have the internal data be a reference to
an object that lives outside of the scope. When changes to that underlying
object can occur anywhere in the code, this is another form of global
manipulation.
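
A short Python illustration, again with invented names (`Report`, `SafeReport`). The first class holds on to the caller's list; the second takes a private snapshot, which is one possible way to close the hole:

```python
class Report:
    def __init__(self, rows):
        self._rows = rows          # keeps a reference to the caller's list

    def total(self):
        return sum(self._rows)


class SafeReport:
    def __init__(self, rows):
        self._rows = tuple(rows)   # private, immutable snapshot

    def total(self):
        return sum(self._rows)


data = [1, 2, 3]
leaky = Report(data)
safe = SafeReport(data)
data.append(100)                   # a change made "somewhere else"
```

`leaky.total()` now reports 106 while `safe.total()` still reports 6. The `_rows` field looked private in both cases, but only the copy was actually out of everyone else's reach.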
In most systems, particularly if there is an interface for users, there is
a considerable amount of mandatory state. Basically the computer is
used to remember the user’s actions so that the user doesn’t have to
keep supplying all of the contextual data over and over again. Depending
on how the surrounding session mechanics interact with the underlying
technology, this can leave a lot of little pieces of required state
lying all over the code. This of course is a form of spaghetti
(variable-spaghetti) and it can be quite nasty because it gets placed
everywhere. Cleaning this up means collecting all of the state
information together into a single location for any given technical
context. So for instance, in a web app there is likely one big
collection of user state information in the browser, and another
collection associated with the user’s session in the server. That’s fine,
and considerably better than having a huge number of little pieces scattered
around both places.
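
One way the server-side half of that could look in Python is sketched below. `SessionContext`, its fields, and the two helper functions are all assumptions for the sake of the example; a real app would hang this off whatever session mechanism its framework provides.

```python
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Hypothetical single home for one user's server-side state."""
    user_id: str
    locale: str = "en"
    cart: list = field(default_factory=list)


def add_to_cart(ctx, item):
    # The state being changed is named explicitly in the signature.
    ctx.cart.append(item)
    return ctx


def greeting(ctx):
    # Reads only from the context it was handed, nothing hidden.
    return "Bonjour" if ctx.locale == "fr" else "Hello"


ctx = SessionContext("u1", locale="fr")
add_to_cart(ctx, "book")
message = greeting(ctx)
```

The state is still mutable and still mandatory, but it lives in one place and gets passed in explicitly, so a test can build whatever context it needs and hand it over.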
Long ago we identified that global variables were a big problem and in many
circles we banished them. But I think we focused too hard on the
‘variable’ part, and not enough on the ‘global’ aspect. Global anything
is a potential problem, anywhere. Like disorganization and redundancies,
it is just a pool of gasoline waiting for a match. Software systems can
be composed of an outrageous amount of complexity, and the only way to
effectively deal with it is by encapsulating as much as possible into
well-organized sub-pieces. If you break that encapsulation... well, you
are pretty much right back to where you started ...
Maybe the real question is how to encapsulate what state info must be stored?
BTW, I saw on another blog you said you were looking for a business partner. I won't say I am interested (although I am in the same boat myself). If nothing else it might be worth sharing experiences if you'd like.
Hi Chris,
Yes, I'm always really happy to share and find out about other experiences. I'm at paul underscore homer at yahoo dot ca.
As for your question, I generally pull state up and out, then call it something like 'context'. If you encapsulate it into a black box (and build around it) you run into testing issues. There are exceptions of course, usually relating to communication or shared resources.
Paul.