Thursday, October 26, 2023

Two Generals

There are two armies, encamped on different sides of a large city.

If both armies attack the city at exactly the same time they will be victorious. If only one army attacks, the city defenders can wipe it out.

One of the generals wants to notify his peer in the other army about when to begin the attack. He sends out a message “Tomorrow at 6am”. But he receives no reply.

He has a huge problem now. Did his messenger make it to the other general? Maybe he did, but the returning messenger was captured and killed. Or maybe his messenger never made it. As he was sneaking around the outskirts of the city he met an untimely end.

So, tomorrow morning, should he attack as he said he would, assuming the messenger was successful? Or should he not attack?

Whether the message was received or not is ‘ambiguous’. The general cannot know which of the two possibilities is true; that his message didn’t make it or that the reply didn’t come back. He doesn’t have enough information to make an informed decision.

Yet the fate of the battle rests on reliable communications…

I’m sure that many readers have various suggestions for ways to remove the ambiguity. For instance, you could send other messengers, lots of them. But if they just disappear too, then the ambiguity is still there. You could try to establish waypoints, so the distance of the communication is shorter. But if there is still even a tiny corridor where the defenders reign supreme, it makes no difference.

What if instead of sending “tomorrow at 6am” you send it as a question “What about tomorrow at 6am?” Then if it is intercepted the general isn’t compelled to act. But now the other general has the exact same original problem with their reply. It swapped the problem, but it didn’t go away.

Clever people might suggest a different, alternative medium. Like flags or something, but the city is large enough that there is no guarantee about visibility, and if you used smoke or balloons, the defenders could just clear it away. The medium isn’t the problem.

The thing is, there is no perfect, always-working scenario. Where you need information but there is only an ambiguity, no matter how small you shrink it, it still remains. The ambiguity is the information.

Worse is that this is a fundamental physical constraint imposed by the universe for any sort of distributed communication. All distributed software, where separate computations need to synchronize on any sort of non-perfect medium, is bound by it. If you have a client and server, or a bunch of peers, or even two processes using a less-than-perfect medium such as files, no technology, protocol, or magic bullet will make the conversation 100% reliable, and even if it is 99.9999% reliable, there is still some sort of ambiguity in your way.

At your very best you could shrink it down to say a 1 in 100 years likelihood of failure, so you could not see it go wrong in your entire career, but it will still go wrong, someday. And that is what makes it so different from a regular computation. It is not deterministic.

Short of some unexpected external event like a flood or gamma rays or a hardware defect or something, all of the other computations will work perfectly each and every time they run on the computer. If they work for their full context, you can always assume they will always work. Oddly, we treat them as 100%, even though the physical nature of the computation itself is subject to adverse advents, the software itself, as a formal system, is not.

There is and will always be a huge gulf between 100% and 99.9...%, between deterministic computations and non-deterministic (in the distributed sense, not the language theory one) ones. Ultimately it affects the level of trust that we place in our software.

No comments:

Post a Comment

Thanks for the Feedback!