Sunday, September 30, 2012


For the purposes of this post I’m going to over-simplify things considerably by saying that all of the stuff that people learn gets stored in their brains as a series of ‘models’. A model in this case is essentially a large set of ‘symbols’ and a whole lot of ‘interconnections’ (relationships) between them. Structurally we could view this as a graph (from graph theory), or we could see it as a multidimensional mesh of some sort. It doesn’t really matter for this post; either perspective will do.
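A minimal sketch of what such a model might look like in code, assuming nothing beyond symbols and labelled relationships (the `Model` class and its method names are my own hypothetical illustration, not anything specified here):

```python
from collections import defaultdict

class Model:
    """A toy 'model': a set of symbols plus labelled relationships.

    Viewed as a graph, the symbols are the nodes and the
    relationships are labelled edges between them.
    """

    def __init__(self):
        # symbol -> list of (relationship, other_symbol) pairs
        self.links = defaultdict(list)

    def add_symbol(self, symbol):
        self.links[symbol]  # touching the key creates the node

    def relate(self, a, relationship, b):
        # record a directed, labelled edge from a to b
        self.add_symbol(a)
        self.add_symbol(b)
        self.links[a].append((relationship, b))

m = Model()
m.relate("dog", "is-a", "animal")
m.relate("dog", "has", "tail")
```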

As we progress through life we collect facts and relationships which help to build up these models. Sometimes we cross-link them to each other, forming larger models, although there is no requirement to do so. We can have many models that are essentially independent of each other.

So what we have in our minds is heavily interlinked multidimensional data that is used to drive our actions and behaviors. Often we want to communicate parts of this to others around us. In order to accomplish this, we pick a starting place in a model and then traverse through the nodes in some complex fashion. We take these paths and we ‘serialize’ them into a stream of sentences which we speak out loud. No doubt the whole process is significantly more complex than what I am describing, but the important aspect is that we are linearizing our vast internal models down to a string of ‘symbols’ that we then communicate to others, who, as they listen, hopefully update their own internal models with what we’ve sent.
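Continuing the toy `Model` sketch from above, the traverse-and-serialize step might look something like this (again purely hypothetical; a real traversal would be enormously richer than following one outgoing edge at a time):

```python
def serialize(model, start, max_steps=10):
    """Walk one path through the model and linearize it into a
    stream of 'sentences' -- a toy stand-in for speaking aloud.
    """
    sentences = []
    current = start
    for _ in range(max_steps):
        outgoing = model.links.get(current)
        if not outgoing:
            break                        # dead end: the path stops here
        relationship, nxt = outgoing[0]  # pick one edge to follow
        sentences.append(f"{current} {relationship} {nxt}")
        current = nxt
    return sentences

print(serialize(m, "dog"))  # ['dog is-a animal']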

The fact that we are serializing paths through our models has a huge number of interesting consequences. An obvious one is that there can be many, many different paths through a model. We see this quite easily because different people will often find very different ways to express the same thoughts. Often too we can see from the paths that two people may have similar models, but that they differ in parts. It isn’t always obvious whether the differences come from the model or from the serialization.

Another consequence is that people listening to the stream (or reading it) don’t always update their model, or they possibly update it incorrectly (due to lack of attention, ambiguities or other interference). And of course we do hold multiple conflicting models in our heads, which is precisely why we so frequently find contradictions in our own understanding. Things we’ve learned in one context can contradict things we have learned in another. If we become aware of that, we could possibly intertwine or merge the models, but we are not always aware of these contradictions.

The serialization output itself is interesting because we have found many different ways of representing aspects of these models. The most obvious is that we use different verbal languages. Within languages we also break out into different subsets of terminology that are specific to a domain, such as law, computers, medicine, etc. As well, we have learned to serialize aspects in more fundamental formats such as music, painting or poetry. These ‘rawer’ forms generally communicate deeper aspects of our models that are oriented around our underlying emotions. We also have more rigorous representations such as mathematical notation, which we’ve broken down into subsets we call ‘branches of mathematics’ (each often associated with one or more formal systems). People commonly see these types of ‘formal’ serializations as ends in themselves, but really they are just linear representations of multidimensional models that happen to be abstract formal (rigid) systems. Reading and understanding a string of new mathematics results in an update to what we know internally in our models.

One of the most fascinating achievements of the twentieth century was our success in instantiating our own abstract formal systems. Prior to this, mathematics produced models in our heads and we used them to help us explain the world around us. They didn’t have physical manifestations, although we could serialize them to paper and pass them on. But once Alan Turing actuated the notation into a construct that we could create physically, at least one of our formal systems left the abstract world and landed right into the middle of the real one. Turing Machines became physical devices and these ‘computers’ are gradually finding a place in every corner of our lives.

That this was possible reopens the question of whether these types of abstractions are purely abstract or are just generalized manifestations of our reality. That’s a pretty long-winded way of saying that our thinking abilities could be entirely bounded by our universe. That we might not be able to create internal models that are purely abstract. That we can’t think outside of the box, if the box is our own physical reality. However, notions like ‘infinity’ and ‘perfect’ do appear to be disjoint from our reality, so they’re unlikely to really exist; only approximations to them do. We can go a little outside of the lines, but it leaves one wondering: how far is too far?

To control our computers we create software, but in a funny sense our creation of software seems diametrically opposed to our own handling of our internal models and serialization. That is, we first create the serialization (the code), then we set it running to achieve a time-based multidimensional behavior. So the running software is a model just like the ones we have in our heads. The code can still be transferred from one machine to another, communicated through a huge variety of ‘programming languages’, but at some point in the future it might be possible that the runtime model of the system could be updated rather than replaced and reloaded as we do currently. That shift from just instantiating new instances, over to communicating between long-running models, could possibly amplify the usefulness of software systems by orders of magnitude. We’ve already started down the road of dealing with ‘big data’, but we’re still focused on the serializations and not yet on the construction of massive models.
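To make that contrast concrete with the toy `Model` from earlier (a speculative sketch of the idea, not a real mechanism): today we ship the serialization and rebuild the model from scratch on the other side, whereas the speculated alternative would patch the long-running model in place.

```python
import json

# Today: ship the serialization, then rebuild the whole model
# from it on the receiving machine.
wire = json.dumps(dict(m.links))
rebuilt = Model()
for symbol, edges in json.loads(wire).items():
    rebuilt.add_symbol(symbol)
    for relationship, other in edges:
        rebuilt.relate(symbol, relationship, other)

# The speculated shift: update the long-running model in place
# instead of replacing and reloading it wholesale.
rebuilt.relate("animal", "is-a", "living-thing")
print(serialize(rebuilt, "dog"))
# ['dog is-a animal', 'animal is-a living-thing']
```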

The Sapir-Whorf hypothesis suggests that our thinking may be altered by the languages we use for communication, but one might easily think that it is the other way around. Some technical subsets of language contain significant localized model primitives, which no doubt drive our internal model construction in very specific ways. Thus if one becomes a lawyer and reads a lot of law, eventually they’ll start ‘thinking like a lawyer’. The primitives they communicate with are very specific sub-models that drive how they build up other models. So, if a language focuses more heavily on an eclectic subset of primitives as it is used for communication, it will affect how people build their internal models. We see this also with computer programmers. They spend their days constructing formal systems, and they often apply that back to the informalities of the world around them, giving them a rather inflexible black-and-white perspective on its many grey aspects.

Another interesting aspect of this perspective is that mathematics, as a set of communicable symbols, could be seen as no harder to learn than any other language. At a high level that is essentially true, but some aspects of formal systems do rely on difficult underlying notions; Douglas Hofstadter called them ‘Strange Loops’. They seem to clash with our built-in intuitive models, causing contradictions. However, once enough of these strange loops are resolved and understood, and are freely available to a person, the rest of mathematics really is just about depth. The models get larger (mathematics is huge), but are still communicable. A question often arises about whether ‘programming’ itself is mathematics, and this too resolves itself. Since programming languages describe a formal system in a similar manner as mathematical notations describe a branch of mathematics, the two share a great deal of commonality. Theorems, for instance, are roughly analogous to libraries of code in that they are higher representations of significant underlying detail: larger primitives that can be wielded more effectively. And since both are bound by the same constraints as any formal system, finding complete sets of non-overlapping primitives is a nearly identical problem in each. But at the same time, they are also two different languages, with the same gulf between them as, say, English and Chinese. Knowing both is possible, but translating between them is harder (although since they have significantly fewer ambiguities than vocal/written languages, much of the difficulty is reduced).

This perspective, that we take multidimensional data and serialize it to a linear stream, provides an interesting framework around a large number of seemingly diverse interactions between people. It not only helps characterize our communications but also gives us an insight into how we think about the world and how we share that amongst ourselves. It is a simplification, but it does help to bind together language, mathematics, computers and our own intelligence. And it gets back to that recurring underlying duality that comes up over and over again between ‘static’ and ‘dynamic’, between ‘nouns’ and ‘verbs’, and between ‘data’ and ‘code’. As we dig into our reality and generalize what we have learned, we often return to the same fundamental patterns, which one could easily suspect are really manifestations of our physical reality.