Sunday, November 15, 2009

Prime Perspectives

It all started with a dream about infinity. Well, not just one infinity, but an infinite number of them, spanning out in all directions. A repeating pattern of infinities splitting off from other infinities. It's a simple pattern, but also a very complex one.

It was intriguing enough to compel me to spend some time exploring my thoughts a little deeper. I was captivated by the overall pattern.

Somehow, after playing around a bit, I associated the dream with the underlying structure for natural numbers (originally I though it might be related to the P=NP somehow).

I suspected that the structure for prime numbers was driven by some type of blending of an infinite number of far simpler infinite patterns; where a new complex pattern emerges as each simpler one is merged into the rest.

Prime numbers play an important role in Number Theory -- one of the most elementary branches of mathematics -- yet they are cast in all sorts of different devilish manners. They are surrounded by myths, superstitions and a long history of obsessions.

They draw a lot of attention because they are simple, yet hugely complex.
"Mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers, and we have reason to believe that it is a mystery into which the mind will never penetrate." -- Leonhard Euler
Because they are a basic building block of our numbering system, they affect Computer Science heavily. Computers rely directly on primes for things like security, but also indirectly they affect all of our numeric calculations. They help to build up the intrinsic errors we live with when using a discrete space of digital numbers to represent a continuous space of real ones.

While primes rarely intercede directly on the day-to-day aspects of programming, they are always there, quietly in the background influencing our number crunching, and contributing to our errors.

What follows in this post are various different 'perspectives' I played with while exploring the world of primes. I'm not sure of their value, nor of their novelty. I suspect they've all been played with before in the past, but I've found little direct reference to any of them.

By necessity, this is a long, long blog entry. I could have broken it up into smaller pieces, but I really wanted to get all of the major elements out at the same time. Posting a bunch of smaller entries, all on the same day is essentially the same as posting one big long one, so I opted for keeping the entire work together.


BASE P

Our standard number system is based around 10 symbols, but programmers are long familiar with the convenience of changing the number of symbols we used to represent our numbers. Binary (two symbols) and hexadecimal (16 symbols) provide well-know examples that are frequently used because they make it more transparent to read some properties of the numbers.

Roman numerals too, have their uses. Although there are normally only 7 symbols (14 if you include differentiating symbols with lines over them) in the scheme, it could easily have been extended to use an infinite number of symbols. However, they likely stopped at a finite number because allowing an infinite number of symbols would ensure inconsistencies in usage (everyone would pick different symbols for any that are undefined, and an infinite number means that there will always be some undefined).

Sometimes an alternative perspective on the same old underlying information is an extremely useful place to start exploration.

With that in mind, I started experimenting around with creating another base for the natural numbers. One which is based around having an infinite number of symbols.

In this case I wanted one symbol for each unique prime number. It is best described as the prime base, or 'base P' for short.

For convenience I will picked the symbols a, b, c, d, ... mostly because they help clearly differentiate the various properties of this encoding (but also because they are easier to type into my blog :-)

Using a rather common perspective of ignoring 1 as not being a prime number, I started by assigning the symbol 'a' to the prime 2. Then 'b' to 3, 'c' to 5, etc. Composite numbers are composed of strings of their prime factors, so 'aa' is 4, 'ab' is 6, 'ac' is 10, etc. To round it out, I used an empty string ("") to represent 1, and and the NULL string (or NULL character) for zero.

Thus, in numeric order (and ignoring 0 and 1 for the moment) we get the following sequence for natural numbers:

a  b  aa  c  ab  d  aaa  bb  ac  e  aab  f  ad  ...

which represent the normal base 10 sequence:

2  3  4  5  6  7  8  9  10  11  12  13  ...

All of the strings are 'normalized'. For any two different types of symbols, say a and b, if they are in a string together then the lessor one should always be written first.

Thus 'ba' is invalid and is actually identical to 'ab'. Both strings represent the number 6. Then 'dea' is actually 'ade'. They are the same, it is just that one is not normalized. 


SOME USAGE

There are a few interesting things that show up within base P.

The first most obvious one is that multiplication and division are very straight-forward. They are just string operations: concatenation for multiplication and deletion for division. For instance 42 divided by 6 looks like:

abd / ab = d

A number is divisible by another, if and only if the same symbols appear in both strings. Integer division is just a simple matter of removing some symbols. Testing for common divisors is just a matter of seeing what subset is the same in both strings.

If multiplication and division get really easy and simple in this new base, addition and subtraction become horrific. A few examples:

"" + aa = c

aabb - d = j

Both of these show that the operations of addition and subtraction are extremely non-intuitive. There is no easy way of doing these two operations. Interestingly this seems to be a swap from the finite bases, where summation is more intuitive and multiplication has always required learning some type of lookup tables.

Of all of the interesting things to do with these numbers, one caught my attention right away. It is best described as 'merging' series.

I started with the series:

a  aa  aaa  aaaa  aaaaa  aaaaaa ....

and seeing that as an 'axis' of some type, I crossed it with a similar axis:

b  bb  bbb  bbbb  bbbbb  bbbbbb  ....

Once I sorted these numbers into numerical order they gave me something like:

a  b  aa  ab  aaa  bb  aab  aaaa  abb  aaab  bbb  aaaaa  aabb  ...

When I placed both axises on a Cartesian-like graph, and then drew arrows from each number in the sequence to the next, I got this really interesting 'search' pattern:




This implies that the sequence is working it's way through each and every possible combination of the two axis in base P.

It's an interesting result given that I started out by looking for patterns composed of an infinite number of patterns branching off to infinitely.

This pattern comes from just a pair of infinite patterns, combined. If I start adding in the other axises, one by one, this could easily become a complex and diverse pattern that fully represents our basic number system.


PRIME PRODUCT CHART

Playing around some more, I decided to create a 'chart' with all of the base P strings, ordered numerically.

I placed the natural numbers in base P on the vertical axis, and each prime symbol itself on the horizontal axis. Because I did the original in a spreadsheet, I put the numbers on the vertical axis headed downwards, and the prime symbols on the horizontal axis headed right. But it could have been oriented horizontally as well.

My original table was vertically labeled AxisA, PlaneAB, DimABC, etc. as I was treating each repeating symbol sequence as an axis, and the whole sequence together as a search pattern throughout these various axises, as they get added into the sequence one by one.

While it was interesting chart, it was really noisy and hard to understand. So I simplified it, eventually to D1, D2, D3, etc. And then later to just a, b, c, .... Then I started highlight specific cells to see if I could find a pattern.

It was during the processing of simplifying the chart that I suddenly realized that there was a very simple deterministic pattern that was clearly flowing though all of the entries (I also realized that I had made a few mistakes in my original input).

After looking at it a bit, I simplified it into what I like to call the "Prime Product Chart". It is an easy and simple representation of the frequency and structure of prime numbers in our natural numbering system:




When I looked at the above chart, I knew I was onto something interesting. Instead of the complex patterns within patterns I was looking for, I had stumbled onto something far more simple.

This chart basically marks out each of the prime factors for each number. That is, where the number equals 0 mod P (is divisible by P) for all of the different primes in the columns. When there are no possible factors, a new prime gets created and added into the pattern.

Looking at the columns, we can see a clear and easily repeating pattern where the distance between each of the marks in the columns is the value of the prime itself. This is a pretty strong pattern, and at least indirectly it has been known for thousands of years.

The Greeks first discovered this as the algorithm named "the Sieve of Eratosthenes". It has been a pretty well-tested over the centuries (although historic computers -- even those just over a hundred years ago -- tended to have personal problems, make mistakes, get hungry and take occasional lunch breaks).

The algorithm generates the set of all primes for a fixed range.

It iterates through the series of numbers, marking off those which are known to not be prime. It does so in each iteration by marking off the factors of the last known prime number.

The first pass marks off every other number, the second every third number, the fourth every fifth one, and so on. One pass for each prime number.

In that way, eventually "all" of the factors for "all" of the numbers have been marked off to insure that any and all composite numbers are properly removed. What's left over are prime numbers.

The Prime Product chart makes the algorithm obvious. The chart can actually be interpreted as just listing out each different iteration of the sieve. Each column is one pass through the algorithm, and where each row is unmarked that is where a new prime is created.

The next prime in the algorithm, for the next pass, can always be found by moving forward from the last prime until a row of all unmarked cells is reached. It's a simple deterministic step.

From the chart we can see that the location (or not) of primes is a direct result of how each of these different columns interact with each other. As they gradually align together, new symbols are needed to fill in gaps in the pattern.

In that way, it is clear that primes are not laid out indiscriminately. They fall where they fall because of what is essentially a negation of the Chinese remainder theorem. When the next number has nothing in common with any of the previous symbols, a new symbol is necessary.

Where I marked the cells, I also added in the number of times the same symbol repeats as one of the factors 1, 2, 3, etc. There is clearly another pattern buried here with the number of repeated factors, but so far I haven't done anything fun with it. There were so many other interesting questions to play with first.

Getting back to it, a new prime occurs whenever there are no other possible symbols to fill in. Gaps always keep occurring (even when there are a huge number of existing primes), so that new primes are always added into the mix. The pattern doesn't stop (or even slow down that much). There are clearly an infinite number of primes that will get created.

From this chart we can make some interesting observations.

The well-known case where two primes fall closely together is a fairly obvious consequence of the fact that a lot of the earlier primes start coming closer into phase (being at 0 mod P at the same time), but at some point in the pattern the heavily repetitive factor 2 becomes the only reason that the primes are not one right after another. Thus, it enforces a minimal distance.

We'll also see later that it is entirely likely that any two closely related primes can appear at any location within the number system, e.g., that there are an infinite number of prime pairs.

In a way we can visualize any of the unmarked cells as holes through which new primes might be created, a sort of "filter effect",  but we'll get deeper into that later in the post.

Note that there is an 'a' symbol in every other string in base P (and a 'b' in every third, etc.). So the sieve itself is directly represented in base P.

Another point that will get way more interesting later is that like the two base P sequences I talked about merging earlier in this post, it is clear that the natural numbers go through each and every permutation of all of the currently existing symbols.

That is, for any set of normalized symbols Pa*Pb*...*Pm, where Pa can be the same prime as Pb, there is one and only one corresponding natural number, and there is always only one number.

The numbers in this sense, lay out all possibly permutations for the normalized strings in base P.

 
ROWS AND COLUMNS

As I explored more deeply, I found a number of interesting things about this chart at the row level.

One easy thing is that it is clear that a prime only exists when there are absolutely no other earlier prime factors are available. That is, a prime occurs when none of the lower primes are 0 mod Pn, where Pn is the nth prime less than the new one.

The gcd function finds lines where each parameter P is 0 mod P, so we need an analogous function, possibly called the lowest common non-multiple (lcnm?) such that:

lcnm(v1, v2, v3, ..., vn) -> 0 != mod v1 && 0 != mod v2 && ... && 0 != mod vn

In this way we can say that:

Pn = lcnm(P2, P3, ... Pn-1)

The nth prime is the lowest common non-multiple of all previous prime numbers.

It's an extremely simple functional representation of the nth prime number (although it nicely glosses over the actual complexities of the numbers themselves).

Another thing I noted was that the minimum distance between prime numbers is always 2, since it is the most common factor. Interestedly enough, the maximum distance between primes changes radically. I watched it go up to around 150 for a small set of numbers, but it clearly goes higher. It is an easy assumption that it will grow to an infinitely large size.

That the distance between primes continues to grow is an important property of the structure of primes. Intuitively it tends to lead to incorrect assumptions.

I had always expected that primes are spaced out a lot farther from each other, as the numbers got bigger. From imagination I would have assumed that prime numbers are few and far between at the really high ranges.

But oddly it just isn't like that. They seem far more dense that one would guess. There are a huge number of massive primes. A surprisingly large number. There are problem a few massive ranges of "low density", but those are followed by ranges of more-or-less normal density (high-density) prime numbers.

The prime-counting function (Pi symbol) attempts to calculate a boundary for prime numbers, but the chart itself is also quite useful for showing how constrained and how dense the overall set of primes really is. There are some structural aspects to the chart that can really help illustrate the patterns.

There are always more symbols entering, but the distances are expanding at more or less the same rate. The density of the primes grows rapidly, but over a rapidly growing field of numbers.


PRIME BLOCKS

What is very noticeable in the chart is that for any N prime numbers (or N columns), there is a block (of rows) in the chart that is exactly P1*P2*...*Pn cells in length.

We can call these N by P1*P2*...*Pn structures 'prime blocks'. As each one is essentially a basic building block coming from the Nth prime number.




That is, for the prime 2 the blocks are every other number. For the prime 3, the blocks repeat every 6 cells. For the prime 5 it repeats every 30 (2*3*5) cells.

Within the block, there are essentially two types of primes. Those that are inherently in the definition of the structure (primes <= Pn) and those that are essentially newly created, above or outside the structure.

I will refer to these as internal and external primes.

As we look at larger blocks in the chart, we see that the thickness of the block grows really slowly by one prime each time, while the depth grows really fast, increases at a massive rate by the product of the next prime.

When I first looked at prime blocks, I missed the fact that as the blocks get larger, the external primes repeat themselves more frequently. While the internal primes form a more-or-less deterministic pattern, the interaction of the different frequencies in the external primes makes for an increasingly complex pattern.

The blocks are essentially stretching rapidly, each time we include another prime, a point which is interesting. Important, because although while we've only explicitly calculated N prime numbers, we've actually tested (and  factored) P1*P2*...*Pn numbers.

That is, in the seive algorithm, as we generate larger and larger prime blocks, we get a sort of magnifying effect out into the number system, showing as all of the other related primes within that block.

Another thing to note is that prime blocks are finite structures for a fixed number of N primes. That is, they exist and get replicated over and over again. Some of the higher external primes will extinguish some of the 'possible' lower ones, but only when they have no common factors. In a way, we can see the blocks themselves a type of base 'filter', and then the external primes add another layer on top of that.

Judging by the overall density of primes, the extinguishing effect from external primes is relatively in-frequent, mostly primes that start in one instance of a block have a good chance of staying prime in any of the following instances (a point that I have been waffling on, so it deserves much more investigation).

At the end of every instance of an N block of primes, the last entry, the one at P1*P2* .. *Pn is always a composite number whose that is divisible by every one of the internal block primes. Lets call it the 'last index' number.

In base P that number is:

a  ab  abc  abcd  abcde  abcdef  abcdefg ...

The last index number is interesting because it delineates the end of the block, but also because of the numbers both on both sides of it. Since it is the last number in the first instance of the block, will call the next number the 'block-start' number.

Often in many cases, the block-start number is a prime.

Looking back at the chart we can see that 7 and 31 are very obvious block-start primes. The 6 pattern repeats on a few different primes until it gets cut down by a couple of powers of 5 (an external prime for the 2*3 block).

Now, it is extremely interesting to note that all this time that we have been assuming that 1 is not, by definition, a prime number, but it is a block-start number (shared for all of the blocks).

Initially it might seem like block-start numbers are always prime, since at least they are trivially co-prime with each of the internal prime numbers. However, the external primes easily make the overall structure significantly more complex.

So, for larger blocks, it turns out that block-starts aren't always prime. The smallest example where they are not is 30031, which is a block-start number for 2*3*5*7*11*13 (abcdef), and is divisible by the external primes 59 and 509.




Block-start numbers are also known as Euler Number and have been around for a long time.

My interest in them comes from mistakenly not realizing that some of the externals could actually extinguish them from being prime. This mistake lead to some very excited effort in trying to generate a massive block-start prime (I generated one with 116 million digits using only a home computer and a week's worth of computing). Oddly, while the number may be prime (or it may not), the sheer size of the number makes it hard to prove, one way or another.


MORE ABOUT BLOCKS

A huge question in Number Theory has been to be able to accurately count the total number of prime numbers. Many great mathematicians have thrown themselves at this problem, always with a limited degree of success.

Perhaps the key issue is that the increasing size of the prime blocks makes the actual number of primes vary significantly for each of the block ranges.

There might not be a single function for the whole range, but possibly only ones for each of the the prime-block ranges (and then a limit as the blocks go to infinity).

I figured it was an interesting issue, but really only from the perspective of trying to show that the arrangement of external primes in very high-up ranges wasn't essentially random.

So I tried taking a new look at this problem but only as trying to find the count within a discrete closed range, may actually produce less complex results.

For instance, we can count the number of primes within specific prime blocks:

For block 2: There is one prime (assuming we don't count 1), which is 50%
For block 6: There are 3, which is 50%
For block 30: There are 10, which is 30%
For block 210: There are 46 primes, which is 21.9%
For block 2310:There are 343 primes, which is 14.5%

The percentages start cleanly, but quickly built up to be 'less obvious' numeric patterns. As the blocks grow larger, the numbers look weirder.

Interestingly enough, we know that the Nth block which is P2*...*Pn in size is actually the N-1 block repeated N times.

If we consider the prime-block itself to be a filter of some type, then for each instance of the filter we can work out the total number of "possible" prime numbers allowed in each block (certainly in terms of the previous smaller prime-blocks, if not in terms of some complete formula).

The big problem comes from having to subtract the correct number of external primes that extinguished any of the possible primes.

Although we could calculate the density of the external primes, we need a simple density function that takes into account overlaps (such as 59, and 509, both extinguishing 30031).

I highly suspect that this calculation is possible, but lately I haven't had the time to dig into it.





PRIME GENERATION FORMULAS

Another interesting issue that pops up all of the time is whether or not we could find one single continuous function for calculating prime numbers. For those not familiar with Binet's formula, this may seem like a near impossible task, but we actually been able to take certain classes of complex recursive functions like the Fibonacci sequence, and create functions from them.

It all starts with being able to get some type of recursive definition for the sequence.

Because we can see the structure of the primes, we can write a program to start at one position and find the next one. We can easily write a program to implement the sieve for instance.

Programs (to some degree) share the same expressive level with recursive formulas, so we can write a recursive formula for creating primes.

We start by stating that a prime number comes from the previous prime, plus some "gap" between the two numbers:

Pn = Pn-1 + Gn(Pn-1)

To make things easier, we want to be able to test to see if any number is not equal to zero modulus some prime. Because we want to use this test for recursive purposes, if the number is not 0 mod P, then we return 1, otherwise 0. In this way we can use it as a multiple, to drop out different terms if necessary. I think the following works:

T(x,m) = Ceil ( ( x mod m ) / x )

If x is 0 mod m, then the fraction is 0 / x, which is 0. Otherwise it is 1. With this function in mind, we can create a recursive function that tests a particular column in the chart to see if all of the cells are all != 0 mod Pk, for all Pk:

FTk(x,Pk) = T(x,Pk) + FTk-1(x, Pk-1)

This is a classic tail-end recursion that evaluates to 0 if all of the recursive levels are 0 mods. From here we can then create a function to calculate the gap simply by trying each row, and if it fails, adding one to the result and moving (via recursion) to the next row.

Gk(x) = 1 + T(FTk(x,Pk), 2) * Gk(x+1)

The outer test function T simply forces the results of FTk to be either 1 or 0, pushing us to the next row in the gap.

With a few initial starting values:

G1(x) = 1
G2(x) = Ceil(x mod 2 / x)

FT1(x, 1) = 1
FT2(x,2) = T(x, 2)
FT3(x,3) = T(x,3) + FT2(x,2)

We have a complete recursive representation of prime numbers.

Now, just because we have a recursive representation doesn't mean that we can find a generator function for this, or that we could find a formula for the Nth base prime.

While Binet's formula does work for the recursive Fibonacci series, the difference is in the fact that the recursion in the Fibonacci sequence is simple (and not entirely necessary, although it is the easiest way to see it). The recursion in the prime number formula is extremely complex and self-referential.

Oddly, it all comes down some issues about information. Although it's not explicit, the Fibonacci sequence contains within its composition a structure that can be represented in multiple ways. Once by a simple recursive function, and again by a more complex one for the Nth number that is based on golden ratios. Although the secondary representation is far from obvious, the "information" to make it work is implicit with the recursive formula and the number system itself (in the same sense that many computer programs depend heavily on the operating system to be complete).

Since it was non-intuitive, it certain opens up the possibility that there is something similar available (but unknown) that would work with the recursive prime function as well. Even more interesting is that this indirectly states that there is more to our axiomatic abstractions, then just the axioms themselves. In a way, they sit on a "consistent" and "logical" foundation, which itself has an intrinsic about of information, which indirectly controls the representations of the axioms and allows (or does not allow) for alternative isomorphisms.

We can see this as the number system itself (as a formal system) allowing for the mapping between Binet's formula and the Fibonacci sequence.

Getting jiggy with it, if we extrapolate this out a bit farther, we can draw a conclusion that there are likely an infinite number of alternative isomorphic representations, any of which can be built on some intrinsic information contained by the rather large space of an infinite number of possible formal systems.

That opens the door to easily saying that just because we have not found an obvious alternative representation of some mathematical object, doesn't in any way imply that there are not some other "useful" ones (there are an infinite number of non-useful ones :-). Or in short, for those that are interested in it, I wouldn't write off the possibility that P=NP just yet, no matter how counter-intuitive the problem appears.


PRIME SUMMATION CHART

The Prime Product Chart proved so useful, that I started searching for a similar type of representation for addition and subtraction.

Realistically, the Prime Summation Chart should have a similar structure to the Product one, but be focused on addition. The Product chart broke down the factors, essentially dealing with the sub-structure of the numbers. In a related fashion the summation chart will break down the base numbers based on partitioning them into prime terms.

It is believed that all numbers can be decomposed into either the sum of two or three prime numbers. With that in mind, if we are looking for arithmetic partitions, we should concentrate those that only involved prime numbers and those that involve a minimum number of them.

The Prime Summation Chart should be similar to it's multiplicative cousin, so once again it's the numbers going down and the lowest prime (in the partition) on the right.

Now with addition since there were so many different partitions I had to narrow it down somewhat to the more interesting ones. This version of the chart sits on the middle numbers for the next round.

I went that way initially because I suspected that partitions involving 1 were going to be trivial, and thus not interesting.



This is similar in many ways to the product version, except that it is using summation.

Wherever possible I tried to represent the numbers with the absolute minimum number of primes, but always primes, since they are the main building blocks.

Although, for each new row there were many different possibly partitions,  in the list of possible partitions I high-lighted the outside elements in orange, and the inside one in grey

Interestingly enough, in the chart is how the patterns of primes keep repeating downwards and to the left. This attribute actually makes the chart fairly easy to generate and verify.

Still, while this chart shows some interesting patterns, it is still fairly complex.

If choosing the center of the possible partitions produced something complex, it might be interesting to see what the trivial choice for partition produces:




I dropped most of the triple partitions, and always picked the most trivial partition for the next row (which I high-lighted in grey).

It is really interesting to see how the patterns of descending numbers are reproducing themselves again and again in each column. The nth prime column just starts the pattern all over again for each row. Generating entries in this table is actually a simple, manual process.

Initially most numbers are either a prime or a product of two primes. But 27 and 35 both don't have any two prime number representations.

The question, is whether or not there is always at least a three prime representation or if at higher numbers it will degrade to four, five, etc.

The answer to this lies in the fact that each new column repeats the same pattern over and over again. It is this property that controls how the higher numbers get generated.

The most obvious point is that 27 and 35 only become partitions of three because the gap between primes is greater than 4. That is, the next numbers are always going to be X plus the last prime. So it's P, then 1 + P, 2 + P, 3+ P, etc. Except that there is not single one-prime representation for 4, so it must become (1 + 3) + P.

Any time the gap between primes is equal to 4, that number can only be decomposed into three prime numbers, not 2.

So, extrapolating a bit, what happens when the gap between numbers reaches 27? If 4 was the first non-prime number that needs to be represented as 2 prime numbers, then 27 is the first prime number that needs to be represented as 3. Thus the smallest partition is going to be:

1 + 3 + 23 + Pk

Where k is some value such that Pk - Pk-1 >= 27.

From this we can guess that this particular circumstance will continue infinitely. Each new entry that can't be represented with some limited number of primes will force some new entry into the number system. So for any N, there is always some number that cannot be represented with less than N prime numbers (thus showing that some of the original guesses over the years were probably wrong).

What might be interesting is whether or not these numbers fall on even or odd numbers. It wouldn't surprise me to see them flip flopping between evens and odds, or just staying with the odd numbers themselves (there is an existing proof in this area which may or may not conflict with the above).

UPDATE:  The range between the primes 2971 and  2999 may be the first one greater than equal to 27. The number 2998 can be expressed as: 1 + 3 + 23 + 2971, but is there also some prime decomposition that is smaller (with only 2 prime terms?)? If so, then the Goldbach conjecture stands, if not then it is wrong. It's easy enough to brute force my way through this search, so hopefully in a few days I'll try, and then update the blog again.My guess is that there won't be another smaller decomposition into primes AND the number 1 + 3 + 23 + 2971 + Pk, where Pk - Pk-1 >=2998 will be an odd number. Thus implying that the minimal decomposition (into 2 or 3 primes) is neither true for odd nor even numbers.

UPDATE2:  Interestingly enough: 2998 = 2 * 1499 = 1499 + 1499. And the smallest decomposition is 2998 = 29 + 2969. Also the smallest composite that is 27 numbers away from a prime is 1354. Oddly, like 2998 the following relation also occurs: 1354 = 2 * 677 = 677 + 677.

The first gap of 35 occurs between 9551 and 9587. Also, 9586 = 47 + 9539 = 2 * 4793 = 4793 + 4793, which shows that it has the same underlying decompositions as the case with 27 (which is exactly as we would expect).

So all of these numbers have the property that Ck = 2* Pi = Pi + Pi. In that way, even if 27 and 35 can only be the sum of three primes, their pattern multiples (1354 and 9551) are back to being expressible with 2 primes. Which (loosely speaking) means that the conjecture is true (although I'd like to state it as every number can be minimally stated as the sum of 1, 2 or 3 primes).

UPDATE3: While I'm here: there are no two-prime decompositions of 27 and 35, because both numbers are odd, and except for 2, all primes are odd numbers. To be decomposable into two primes Pa and Pb, the following must hold true:

Pa = 27 - Pb   (or 35)

And since Pa can't be even (divisible by 2) Pb must be even (and thus not a prime).

(the concept of 'odd' numbers is clearly useful beyond things just being a factor of 2, so is it possible to extrapolate that to higher numbers, say 3, for instance? Could be call that "trodd" ? Or is it just Friday?)

UPDATE4: This is my last update, I promise. I just started thinking that every positive natural number (k) can be deterministically generated (in sequence) from the following rules:
  • The number is a prime number Pi
  • The number is an Nth entry away from a prime number, either as the last prime plus another, or the last prime plus two others. So Pi + Pa or Pi + Pb + Pc
  • The number divided by two is a prime number, so Pi + Pi
Then a stack-based algorithm to generate all of minimal prime sums for all of the numbers is:
  1. Check if the number is prime, if so use that number and reset the stack
  2. If not, take the last prime and the last numbers on the stack
  3. If the last numbers are one or two, use them plus the last prime
  4. Else divide the number by two, and use it twice
  5. Finally: add the representation (numbers) to the stack
The only question left over, is we know this appears to work for expansions of 1 + 3, but does it work similarly for the next two-sum expansion 1 + 5 ? The first gap where that shows up is between 89 and 97, where 95 = 89 + 1 + 5, and there are no smaller representations. If everything holds true, then

95 + Pk = 2 * Pi

where both Pk and Pi are primes, and there are no prime numbers in the range (Pk, Pk+95]. One quick check shows that this doesn't occur in less than numbers less than 25,000. Maybe I'll need another update after all :-)


SUDOKU

Getting way back to generating large prime numbers, I suspect that it is entirely possible to work backwards from some point in the number space to finding any nearby associated prime numbers.

Although primes clearly have a pattern, to get to the Nth element in the pattern, one has to take in account all of the "information" from the previous N-1 elements.

For massive numbers, this is a huge amount of information. However I keep thinking it might be similar to solving a Sudoku puzzle.

In a Sudoku puzzle, there is a relationship that each number only appears uniquely per row, per column and per square (the major ones).

The puzzle starts with a few numbers filled in (which we can see as information). In order to solve the puzzle, readers can utilize all of the intrinsic relationships in order to absolutely assert the location of a specific number. It's not a puzzle where one needs to guess, you can "absolutely" be certain that a number appears in a specific location if and only if you have made the correct inferences from all of the available information.

As such, it is deterministic, and once someone starts the puzzle (assuming that it really is a valid Sudoku puzzle), it can always be correctly finished. No guessing, no wrong answers, no ambiguity.

Within the number system there are a larger number of 'partial information' holders that can be used to say something relative about a range of numbers. We know all sorts of relations that occur between numbers that do not need to be tied to all of the intrinsic structural knowledge.

That is, we could drop into a range of numbers and starting at one specific point, make associations between the various different sets of numbers. We can make some high level associations, but also some low-level ones, such as that some number is a power of 2.

If we can build up enough information, over a large enough range, we should be able to correctly infer that some of those numbers in this (increasing) range are prime numbers.

This is very similar to how I produced the Prime Product Chart for the block-start number 30031. I didn't have to compute everything up to that number in order to be able to diagram the number itself. I just started by placing 30031 in the center and worked my way outwards on both sides. Each time I factored the nearby numbers, I updated the chart to show there revised structure.

Of course for that example, I cheated somewhat. I had access to a UNIX factor program, which quickly and conveniently produced the correct factors, that I added to the graph.

Still, if factor can be seen as producing an "absolute" reference to some information (factors in this case), I suspect that there is a "relative" variant that could still provide reasonable information that we can use.

Even small operations are expensive on multi-million digit numbers, so to be practical we'd have to produce some pretty strong information based on very little related information. With enough of it, we could re-construct the surrounding numerical space, and then draw inferences about the elements in that space.


FINAL NOTES

There is more. Lots more. I manged to get the major points out for each of my key focuses, but it'll probably take me a long time to sift through the details and find all of the rest of things of interest.

My biggest problem is being able to find the time to explore this world of primes to the degree that my curiosity is calling for. It's nice because it is simple and accessible, but often I find I have to think about things for a long time, or spend a lot of time reading books on topics like Number Theory.

Hopefully in the new year I'll get a chance to do some follow up work. I'd really like to try experimenting around with finding inexpensive "relative" relationships. Some intuitive inner-voice suggests that that might be very workable. Or perhaps, I might just wait again, until the next weird dream wakes me up in the middle of the night. I managed to get a fair amount of traction from that last one.