“Everything is relative in this world, where change alone endures.”
A huge problem in software development is to create static, rigid models 
of a world constantly in flux. It’s easy to capture some of the 
relationships, but getting them all correct is an impossible task. 
Often, in the rush, people hold the model constant and then overload
parts of it to handle the change. Those types of hacks usually end badly.
Screwed up data in a computer can often be worse than no data. It can
take longer to fix the problem than it would to just start over. But of
course if you do that, all of the history is lost.
One way to handle the changing world is to make the meta-relationships
dynamic. Binding the rules to the data gets pushed upward towards the
users; they become responsible for enhancing the model. The abstractions
 to do this are complex, and it always takes longer to build than just 
belting out the static connections, but it is often worth adding this 
type of flexibility directly into the system. There are plenty of 
well-known examples such as DSLs, dynamic forms and generic databases. 
Technologies such as NoSQL and ORMs support this direction. Dynamic 
systems (not to be confused with the mathematical ‘dynamic programming’)
 open up the functionality to allow the users to extend it as the world 
turns. Scope creep ceases to be a problem for the developers; it becomes
 standard practice for the users.
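
As a rough illustration of pushing the model upward to the users, here is a minimal sketch in which the fields of an entity are data rather than code. The names `FieldDef` and `EntityType` are hypothetical, invented for this example, and a real system would need persistence, permissions and far more validation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical names: FieldDef and EntityType exist only for this sketch.


@dataclass
class FieldDef:
    name: str
    parse: Callable[[str], Any]  # converts raw user input into a typed value
    required: bool = True


@dataclass
class EntityType:
    name: str
    fields: Dict[str, FieldDef] = field(default_factory=dict)

    def add_field(self, fd: FieldDef) -> None:
        # Users extend the model at runtime; no code change is needed.
        self.fields[fd.name] = fd

    def build(self, raw: Dict[str, str]) -> Dict[str, Any]:
        # Bind the user-defined rules to the incoming data.
        unknown = set(raw) - set(self.fields)
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        record: Dict[str, Any] = {}
        for fd in self.fields.values():
            if fd.name in raw:
                record[fd.name] = fd.parse(raw[fd.name])
            elif fd.required:
                raise ValueError(f"missing field: {fd.name}")
        return record


customer = EntityType("customer")
customer.add_field(FieldDef("name", str))
customer.add_field(FieldDef("age", int))
# Later the world changes, and a user adds an optional field.
customer.add_field(FieldDef("loyalty_tier", str, required=False))

record = customer.build({"name": "Ada", "age": "36"})
```

The structure (typed fields, required or optional) stays fixed in code, while the specific fields become the users' responsibility.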
Abstracting a model to accommodate reality without just letting all of the 
constraints run free is tricky. All data could be stored as unordered 
variable strings for instance, but the total lack of structure renders 
the data useless. There needs to be categorization and relationships to 
add value, but they need to exist at a higher level. The trick I’ve 
found over the years is to start very statically. For all domains there 
are well-known nouns and verbs that just don’t change. These form the 
basic pieces. Structurally as you model these pieces, the same type of 
meta-structures reappear often. We know for example that information can
 be decomposed into relational tables and linked together. We know that 
information can also be decomposed into data-structures (lists, trees, 
graphs, etc.) and linked together. A model gets constructed on these
types of primitives, whose associations form patterns. If multiple
specific models share the same structure, they can usually be combined
and, with a little careful thought, named properly. Thus all of the
different types of lists can become just one set of lists, all of the
trees can come together, etc. This lifts up the relationships by
structural similarity into a considerably smaller set of common
relationships. This
 generic set of models can then be tested against the known or expected 
corner-cases to see how flexible it will be. In this practice, ambiguity
 and scope changes just get built directly into the model. They become 
expected.
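
To make the merging step concrete, here is a small sketch in which two unrelated-looking models, an org chart and a product catalogue, collapse into one tree structure. The `Node` class and the `kind` labels are assumptions made up for the example, not a prescribed design.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    kind: str   # which domain concept this node represents (hypothetical labels)
    label: str
    children: List["Node"] = field(default_factory=list)

    def add(self, kind: str, label: str) -> "Node":
        # Attach a new child and return it so calls can be chained.
        child = Node(kind, label)
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Node"]]:
        # Depth-first, pre-order traversal shared by every tree in the system.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# An org chart and a product catalogue share one structure.
org = Node("department", "Engineering")
team = org.add("team", "Platform")
team.add("employee", "Ada")

catalogue = Node("category", "Books")
catalogue.add("category", "Fiction").add("product", "Dune")

org_labels = [(d, n.label) for d, n in org.walk()]
catalogue_labels = [(d, n.label) for d, n in catalogue.walk()]
```

The traversal is written once. A new kind of hierarchy is a new `kind` value, not a new class.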
Often when enhancing the dynamic capabilities of a system there are critics 
who complain of over-engineering. Sometimes that is a valid issue, but 
only if the underlying model is undeniably static. There is a difference
 between ‘extreme’ and ‘impossible’ corner-cases; building for the
impossible is a waste of energy. Oftentimes though, the general idea of
 abstraction and dynamic systems just scares people. They have trouble 
‘seeing it’, so they assume it won’t work. From a development point of 
view that’s where encapsulation becomes really important. Abstractions 
need to be tightly wrapped in a black-box. From the outside the boxes 
are as static as any other piece of the system. This opens up the 
development to allow a wide range of people to work on the code, while 
still leveraging a sophisticated dynamic behavior.
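
One way to picture the black box is a sketch like the following, where a dynamic attribute store sits behind an ordinary, static-looking interface. Both `_AttributeStore` and `CustomerService` are hypothetical names invented here.

```python
from typing import Any, Dict


class _AttributeStore:
    """Dynamic part: arbitrary named attributes per entity id."""

    def __init__(self) -> None:
        self._data: Dict[int, Dict[str, Any]] = {}

    def put(self, entity_id: int, name: str, value: Any) -> None:
        self._data.setdefault(entity_id, {})[name] = value

    def get(self, entity_id: int, name: str) -> Any:
        return self._data[entity_id][name]


class CustomerService:
    """Static face: callers only ever see ordinary typed methods."""

    def __init__(self) -> None:
        self._store = _AttributeStore()
        self._next_id = 1

    def register(self, name: str, email: str) -> int:
        # Allocate an id and record the attributes in the dynamic store.
        cid = self._next_id
        self._next_id += 1
        self._store.put(cid, "name", name)
        self._store.put(cid, "email", email)
        return cid

    def email_of(self, customer_id: int) -> str:
        return self._store.get(customer_id, "email")


service = CustomerService()
cid = service.register("Ada", "ada@example.com")
```

Someone calling `register` and `email_of` never has to understand the generic store underneath.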
I’ve often wondered about how abstract a system could go before its
performance was completely degraded. There is a classic tradeoff
involved. A generic schema in an RDBMS, for example, will generally have
slower queries than a static 4NF schema, and a slightly denormalized
schema will perform even better. Still, in a big system, is losing a
little bit of performance an acceptable cost for not having to wait 4
months for a predictable code change to get done? I’ve always found it
reasonable.
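
The tradeoff is easier to see side by side. The sketch below uses Python's built-in `sqlite3` module with two made-up tables: a static `customer` table and a generic entity-attribute-value `attribute` table. It measures nothing; it only shows that the generic query needs an extra self-join for each attribute involved, which is where the extra cost tends to come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Static schema: one column per attribute.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customer VALUES (1, 'Ada', 'London'), (2, 'Linus', 'Helsinki');

    -- Generic schema: one row per (entity, attribute, value).
    CREATE TABLE attribute (entity_id INTEGER, name TEXT, value TEXT);
    INSERT INTO attribute VALUES
        (1, 'name', 'Ada'),   (1, 'city', 'London'),
        (2, 'name', 'Linus'), (2, 'city', 'Helsinki');
""")

# Static: a single-table lookup.
static_rows = conn.execute(
    "SELECT name FROM customer WHERE city = ?", ("London",)
).fetchall()

# Generic: one self-join per attribute that takes part in the query.
generic_rows = conn.execute(
    """
    SELECT n.value
    FROM attribute AS c
    JOIN attribute AS n ON n.entity_id = c.entity_id AND n.name = 'name'
    WHERE c.name = 'city' AND c.value = ?
    """,
    ("London",),
).fetchall()
```

Both queries return the same answer, but adding a new attribute to the generic version is an `INSERT`, while the static version needs a schema change.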
But it is possible to go way too far and cause massive performance
problems. Generic relationships wash out the specifics and can drive the
code into NP-complete territory or worse. You can model anything and
everything with a graph, but the time to extract the specifics is deadly:
general pattern matching over a graph (subgraph isomorphism) is
NP-complete, and with known algorithms the worst-case cost climbs
exponentially with increases in scale. A fully generic model of
everything just being a relationship between everything else is possible,
but rather impractical at the moment. Somewhere down the line, some
relationships have to be held static in order for the system to perform.
Fewer is better, but some are always necessary.
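
As a toy illustration of why extraction gets expensive, here is a deliberately naive pattern matcher over a fully generic store of (subject, relation, object) triples. Everything in it (`TRIPLES`, `match`, the `?variable` convention) is invented for the sketch. Because it rescans every triple for every pattern, its worst-case work grows roughly as the number of triples raised to the number of patterns; real graph engines use indexes to do much better on typical queries.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Everything is "just a relationship": no tables, no fixed structure.
TRIPLES: List[Triple] = [
    ("ada", "works_in", "platform"),
    ("linus", "works_in", "kernel"),
    ("platform", "part_of", "engineering"),
    ("kernel", "part_of", "engineering"),
]


def match(
    patterns: List[Triple], binding: Optional[Dict[str, str]] = None
) -> Iterator[Dict[str, str]]:
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in TRIPLES:  # full scan for every pattern: no index to lean on
        new = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, new)


# "Who works in a team that is part of engineering?"
results = list(
    match([("?who", "works_in", "?team"), ("?team", "part_of", "engineering")])
)
```

Holding even one relationship static, say a real `works_in` table with an index, removes a whole level of that scanning.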
Changing relationships between digital symbols mapped back to reality is
the basis of all software development. These can be modeled with higher
level primitives and merged together to avoid redundancies and cope with
expected changes. These models drive the heart of our software systems;
they are the food for the algorithmic functionality that helps users 
solve their problems. Cracks in these foundations propagate across the 
system and eventually disrupt the user’s ability to complete their 
tasks. From this perspective, a system is only as strong as its models 
of reality. It’s only as flexible as they allow. Compromise these 
relationships and all you get is unmanageable and unnecessary complexity
 that invalidates the usefulness of the system. Get them right and the 
rest is easy.  
 