Even the "computable real numbers" are quite misunderstood. Most mathematicians reading this paper suffer from the impression that the "computable real numbers" are countable, and that they are not complete. As I mention in my recent book, this is quite wrong. Think clearly about the subject for a few days, and you will see that the computable real numbers are not countable, and are complete.

I do not get this. Does he imply that "computable real numbers" are more than countable? Surely not. Maybe less than countable? And what is meant by "complete" here? I am confused...

I think he's not being clear. There are computable sets that are uncountable (e.g. Cantor's middle-thirds set). And if you take the real number line as a given so that you may choose any random x from the real number line, then there are an uncountable number of computable numbers reachable from a random real x via the simple operation x+c for every computable number c. If he means that the set of all c is uncountable, then, yes, he's mistaken.

As for the "complete", I think he's referring to topological completeness... i.e. that analytical stuff about every Cauchy sequence converging... I'm fuzzy on the details, but I intuitively categorize it as meaning that you can have infinite decimals for any constructive sequence you care to name, and that all the operations of calculus that we care about (i.e. the tools required to have an "analytical" space where you can take derivatives and integrate) are well-defined and work just fine.

The p-adic numbers are a perfect example of this. Sure, there are lots of real numbers that cannot be "mapped" to the p-adics... but it really doesn't matter because a) you could never name a particular real number that cannot be mapped to the p-adics and b) all the things we care about, in terms of arithmetic etc. can be done on the p-adics just fine. And, I just discovered tonight (this is why I'm up so late) how the p-adic numbers can be extended to an algebraically closed field (thus, comparable to the complex numbers... I'm speaking with loose terminology here because I'm not a professional mathematician).

The secret is that the p-adic absolute value |x|_p can be used to define a metric over the p-adics. What is blowing my mind right now is that taking its logarithm gives an expression formally equivalent to the one used in information theory and the physics of entropy (-log_p |x|_p). What is also blowing my mind is that there is no imaginary unit... yet you still have an algebraic closure. We have been drilled that if you want algebraic closure (roots and logs of negative numbers, etc.) then you have to just accept the imaginary unit, there's no way round it. Looks like the p-adics in C_{p} are a way round it.

I predict that - within two centuries - school children will be learning arithmetic with p-adic numbers. I would say one century but we're still under the stone age dinosaur government school system... *sigh.

Training of the young is like that in secret societies---immersion in the cult involves intensive undergraduate memorization of the standard thoughts before they are properly understood, so that comprehension often follows belief instead of the other (more healthy) way around. A long and often painful graduate school apprenticeship keeps the cadet busy jumping through the many required hoops, discourages critical thought about the foundations of the subject, but then gradually yields to the gentle acceptance and support of the brotherhood. The ever-present demons of inadequacy, failure and banishment are however never far from view, ensuring that most stay on the well-trodden path.

The large international conferences let the fellowship gather together and congratulate themselves on the uniformity and sanity of their world view, though to the rare outsider that sneaks into such events the proceedings no doubt seem characterized by jargon, mutual incomprehensibility and irrelevance to the outside world. The official doctrine is that all views and opinions are valued if they contain truth, and that ultimately only elegance and utility decide what gets studied. The reality is less ennobling---the usual hierarchical structures reward allegiance, conformity and technical mastery of the doctrines, elevate the interests of the powerful, and discourage dissent.

There is no evil intent or ugly conspiracy here---the practice is held in place by a mixture of well-meaning effort, inertia and self-interest. We humans have a fondness for believing what those around us do, and a willingness to mold our intellectual constructs to support those hypotheses which justify our habits and make us feel good.

I'm actually a little more cynical on this point. This is based on my own experience in my field (computer engineering). The Establishment (i.e. the management hierarchy, corporate and the political orbit of the upper technical echelons) really is heavily interested in controlling the market in every field it touches... that includes even mathematics and physics. I think the same thing that is going on with these ridiculous particle accelerator Easter Egg hunts also goes on in mathematics. What do you think Grigori Perelman was protesting by turning down the Clay Institute prize money??

Basically, the Establishment doesn't like any new revolutions in any field. So, one way to keep people in line is to keep them busy about solving "important" problems, where important is defined as all the problems that the old codgers took up in their youth and never cracked... but they have mastery of the general area, which means that they're still the top dog as long as people keep talking about solving those problems. So we end up with generations of mathematicians hurling themselves against the same narrow list of "important" problems.

Normally the concentration is 0.720%; these samples had only 0.717% – a significant difference. This discrepancy required explanation, as all uranium handling facilities must meticulously account for all fissionable isotopes to assure that none are diverted for weapons purposes. Thus the French Commissariat à l'énergie atomique (CEA) began an investigation.

It's like having your mind expanded. Looking at the Sun and realizing that this supposed "atomic bomb in the sky" is really just a giant planet like any other that has been "exalted" due to a massive accumulation of electrical charge in its ionosphere (what we see as the Sun's photosphere) and which is not only intimately electromagnetically connected with all the bodies within the Solar System, but is also connected upwards to the rest of the galaxy which, in turn, is connected to the rest of the Universe... well, it's mind-blowing.

And I think that it's not a coincidence that our cosmology teaches that we're on a rock hurtling through empty space, connected to everything else by nothing but the tiniest, most indifferent thread of gravitational pull... and our culture exudes a sense of nihilistic isolation at an individual level. Which causes which? I don't know, but I don't think it's merely coincidence. We have never been so disconnected from one another.

On a separate topic, real mathematics:

Will this central lesson ever sink in? The purpose of theory is to organize phenomena and nothing else. Theory must be the slave of the data. Today, many scientists pay lip-service to this but then turn around and do the very opposite.

Everything is traveling through time. That should make it clear why time-travel in the fictional sense is impossible. Time is nothing else but the causal relation.

Because time is the causal relation, it too is governed by the law of conservation (causality implies conservation). The conservation law is the essence of physics - if matter cannot be created or destroyed, then whenever the distribution of matter is altered from equilibrium, a surplus and deficiency are created simultaneously. For example, if you mount a fan in your window and push air out of your house, this increases the pressure outdoors and decreases the pressure indoors by precisely the same extent (per volume). If you move a billiard ball from one end of a billiard table to the other end of the billiard table, the density of billiard balls on the end from which it was removed must decrease and the density on the end to which it was moved must increase.

Perhaps not quite as obviously, this same principle operates in time. If you strike a billiard ball and it moves across the table unimpeded, no motion is deducted from the ball, except by friction, which we are neglecting. If you strike the exact same ball with exactly the same force, angle, etc. and place another billiard ball in its path, something different will happen to the struck ball - its motion (energy) will be reduced when it encounters the ball in its path. In order to impart motion (energy) to a body, motion (energy) must be given up to exactly the same extent by the causal body.

By definition, the effect cannot be the cause, so time travel is not even coherent, let alone possible.

-----

Consider two photographs:

In the first photograph, there doesn't appear to be much going on. There is no change occurring or whatever change is occurring is happening very slowly. In the second photograph, there appears to be a great deal of change or motion occurring. But if we think of both photographs as an "instant in time" - that is, think exclusively of the spatial configuration of the bodies - it is merely our mental prejudice that assigns a great deal of activity to the latter photo and very little to the former photo.

What this tells us is that the spatial configuration of bodies is not all the information required to understand their evolution. Something more, something "hidden" is required to understand the evolution of real systems. This "hidden" thing can be called "the laws of physics" or whatever you like but the problem here is that laws of physics are merely conceptual, whereas whatever governs the unfolding of reality is itself as real as the bodies it governs.

It won't do to invoke more bodies of the same nature as the bodies we are trying to explain in order to explain their evolution. This is the problem with modern "particle physics" where new particles are invented to explain every force and so on. For every new particle you concoct to solve a problem of physics, you have just created a new problem of physics (what makes this particle evolve?)

There is one "force" that is implied within the principle of causality itself - namely, the tendency toward equilibrium. The tendency toward equilibrium is implied in causality because if we have a list of all the causes of phenomena, then there can be no other arbitrary tendencies away from equilibrium which are not on the list. Once all the laws of a system are known, the only "deviations" from the system can be "noise", which is simply the microscopic effects of the known laws that we have neglected to calculate for convenience.

Hence, every displacement of a causal system from equilibrium also implies its eventual return to equilibrium. In this, we see that the displacement from and return to equilibrium is analogous to spatial surplus/deficiency, as each is implied in the other. There is spatial positivity (surplus of bodies relative to equilibrium), spatial negativity (deficiency of bodies relative to equilibrium) and these evolve according to temporal positivity (displacement from equilibrium) and temporal negativity (return to equilibrium).

A final note on equilibrium is in order. Equilibrium cannot be an exactly uniform distribution because an exactly uniform distribution itself implies a causality (law) which has not been accounted for. Hence, equilibrium is necessarily a random, average uniformity. If it were not random, then there would be something yet left to explain, when all laws had supposedly been accounted for.

What are our senses? Assuming the Darwinian hypothesis of common descent, for the sake of argument, we can look to other biological organisms for clues. After all, humans are not unique in their capacity to sense the state of the physical world. Other animals can see, hear, smell, feel and taste.

The simplest organisms, single-celled organisms, also sense the state of the physical world. They do not “see” or “hear”, per se, but they do measure or sense the state of the physical world around them and react accordingly. Bacterial conjugation is mediated by the pilus, a small hair-like structure through which a donor bacterium (which has a pilus) senses and attaches to a recipient bacterium (which has none) so genetic material can be transferred. Once the transfer has occurred, the recipient can use the genetic material to construct its own pilus, converting it into a donor bacterium.

At an even smaller scale, the many intra-cellular functions are performed by large molecules called polymers. The DNA replication process utilizes numerous polymers. Consider one polymer in particular, the restriction enzyme or restriction endonuclease. Its purpose is to cut the DNA strand at a specific point. The endonuclease performs this operation by chemically recognizing a specific DNA sequence (genetic code pattern) in the region where it is to perform the cut. Each distinct restriction enzyme exists for the purpose of cutting the DNA at one specific sequence, and it is constructed in such a fashion that it chemically attaches to the DNA only where that sequence occurs.

The restriction enzyme is a chemical which, in the language of computing theory, performs pattern recognition. It is really nothing more than a very large molecule, or macromolecule, yet it can discriminate between the billions of base pairs along the length of the DNA which it is not supposed to cut and attach to the DNA only at the points where the DNA sequence is a chemical match. The endonuclease, despite not even being a living thing, is capable of categorizing the physical world. That is, the endonuclease can discriminate between this and that.
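In computing terms, what the endonuclease does is string matching. A minimal sketch in Python: the recognition site GAATTC belongs to the well-known enzyme EcoRI, but the DNA string below is invented for illustration.

```python
# Toy model of a restriction enzyme as a pattern recognizer.
# EcoRI's recognition site is GAATTC; the DNA sequence here is made up.

def recognition_sites(dna: str, site: str) -> list[int]:
    """Return every index at which the enzyme's site occurs in the DNA."""
    return [i for i in range(len(dna) - len(site) + 1)
            if dna[i:i + len(site)] == site]

dna = "ATGAATTCCGGAATTCTTAA"
print(recognition_sites(dna, "GAATTC"))  # indices where EcoRI would bind
```

The enzyme, of course, does this chemically rather than by iterating over indices, but the functional role - discriminating one pattern from billions of non-matching positions - is the same.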

Whatever else can be said about what sense perception is, its functional role is exactly that of discrimination. Sense perception enables us to discriminate friend from foe, male from female, predator from prey, and so forth.

Discrimination and Symbols

You are presently reading the words which I have written on this page. Written words are discrete symbols, meaning, they are easily distinguished from one another. Words are unlike liquid which is fluid and continuous. Once you pour a cup of water into a bowl filled with water, there is no way to continue distinguishing the water which was once in the cup from that which was already in the bowl. The water mixes together in a fluid, amorphous manner. On the other hand, pebbles, like those along the bottom of a creek, are easily distinguished from one another. If you have five pebbles in your hand, and I give you two more to hold in your hand and ask that you keep track of them so you can give back the very same pebbles to me, you will be able to do so. If I had asked you to do the same with a cup of water poured into a larger bowl of water, you would be unable to do so.

This difference is the distinction between the discrete and the continuous. The discrete consists of everything which can be easily distinguished. But distinguishing or discriminating presupposes a discriminator. Restriction endonuclease can discriminate between the portion of the DNA sequence it is responsible for cutting and all other DNA sequences but it cannot discriminate between red-colored and blue-colored objects because it is not a color-discriminator. The human retina, optic fiber, visual cortex and the other components of human vision do comprise a color-discriminator (within the range of human-visible light). With some fuzziness around the boundary (violets), humans can reliably distinguish red-colored objects from blue-colored objects.

The human brain is also capable of distinguishing auditory objects. The consonants and vowels of various languages can be reliably distinguished from one another by their speakers. This ability to discriminate between the discrete auditory objects of language is a crucial building block of human language. If the sounds of words were indistinguishable from one another – like water which, once mixed, is indistinguishable – communication through sound would not be possible.

Symbols are not abstractions even though abstraction is involved in the use of symbols. Symbols are physical objects. Consider a box filled with wood cutouts of letters of the English alphabet. We hand them to a young child who is just learning the alphabet and ask her to organize them however she likes. Depending on how well she has learned her alphabet, she may organize them by grain, color, weight, size, lightness/darkness, or geometric shape. If she organizes them by geometric shape, we will recognize that she has grasped the abstraction of letter. The “letter A” is an abstraction which exists independent of its physical instantiation. But it is important to keep in mind at all times that the abstraction of the letter A is not a symbol, it is an abstraction. The symbol A is only ever a physical thing. It is the sound spoken or the letter written but, in all cases, it is a measurable, physical object. The ability to discriminate the symbol A from the symbol B is what makes language possible.

Calculation as Physical Prediction

Since symbols are physical, we can build artificial machines which recognize them and manipulate them. In doing so, we are merely following in Nature's footsteps, as the restriction endonuclease shows. Early computer programs were entered into electronic computers with the use of punch cards or punched paper tape. The computing device had an input sensor which could discriminate between a punch and a non-punch and input a series of such symbols. The input device converted the symbols from mechanical to electrical form in the computer’s memory.

One of the earliest mechanical computers was Charles Babbage’s difference engine. It is a machine built out of metallic parts which can accept inputs through setting levers and then produce an output after cranking the machine through its cycle. Calculation can be defined as prediction of the long-run steady state of such a physical device.

Computation and Randomness

In 1936, Alan Turing published the foundational paper On computable numbers, with an application to the Entscheidungsproblem in which he presented a thought-experimental device which is today called the Turing Machine. One particular kind of machine which Turing described is the Universal Turing Machine (UTM). It is the mathematical formalization of the computer. Nowadays, we find the fact that machines can perform general problem-solving tasks unremarkable. But at the time Turing wrote, “computer” was an occupation, not a device. The unique insight of Turing was that it is possible to build a computing device which can be used as a general problem-solver, that is, to solve any given kind of mathematical problem.

Armed with such a device, the natural question to ask is whether we might be able to solve any mathematical problem whatever. Turing answered that question in the very same paper: no. I will sketch the argument he gave to prove it.

Every program that is executed on a Turing machine will either halt after some finite number of steps or will continue indefinitely without halting. Turing asked: is there a program that decides, for any given program, whether or not it will halt? Let us assume that there is such a program and name it HALT(x), where x is the program for which it is to be decided whether or not it halts when run on its own description. From HALT we can construct a new program, DIAG(x), that does the opposite of what HALT predicts: when HALT says x halts, DIAG does not halt, and when HALT says x does not halt, DIAG halts. Finally, we pass DIAG to itself and ask what happens. If DIAG(DIAG) halts, then by construction it does not halt, and if DIAG(DIAG) does not halt, then it halts. Thus, we have shown by contradiction that our initial assumption was false. There is no such program.
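The shape of the argument can be written down directly in code. This is a sketch, not a working decider: the oracle below cannot be implemented, which is exactly the theorem.

```python
def halts(program, arg) -> bool:
    """Assumed halting oracle: True iff program(arg) eventually halts.
    Turing's theorem says no such function can exist, so this raises."""
    raise NotImplementedError("no implementation is possible")

def diagonal(program):
    """Do the opposite of whatever halts() predicts program(program) does."""
    if halts(program, program):
        while True:   # predicted to halt -> loop forever instead
            pass
    # predicted to loop forever -> halt immediately

# The contradiction: if diagonal(diagonal) halts, the `if` branch makes it
# loop; if it loops, it falls through and halts. So halts() cannot exist.
```

Running `diagonal` on itself is never actually possible here, since any call into the oracle raises; the point is purely the self-defeating structure.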

Turing's result has very general implications for mathematics since the Universal Turing Machine can model the behavior of any formal system. The mathematician Gregory Chaitin has extended Turing's work and has shown that the consequences of the fact that there are unprovable mathematical truths are staggeringly broad. Chaitin has shown the connection between Godel's famous 1931 incompleteness theorems and Turing's uncomputable problems. "There are mathematical facts that are true for no reason", says Chaitin.

Chaitin's most famous contribution to mathematics is his halting probability, Omega. The idea of Omega is to attempt to "solve" the halting problem by estimating the probability that a program halts rather than directly answering the question in each particular case. After applying a suitable "prefix" to each program so that it is "self-delimiting", Chaitin is able to apply a probability distribution over the set of all programs and then ask: what is the probability that a program chosen at random from the set of all programs will halt? It turns out that the numerical value of this probability is itself uncomputable - it must be uncomputable, because we could use its numerical value to solve the halting problem in particular cases (knowing the first n bits of Omega would let us decide halting for every program up to n bits long).

This is remarkable because we have a number that has a definite value - the halting probability exists - but finding its actual value is maximally computationally difficult, and its digit sequence is indistinguishable from a random sequence generated through physical methods, for example. That is, the halting probability possesses no internal structure that mathematics could exploit.

We can construct a physical device that will approximate the digits of Omega from below, in the limit^{1}, and we can add as many extensions to this device as desired (more memory banks). We cannot, however, state what the long-run behavior of such a device will be.

Truth and Proof

"What is truth?", Pilate asked Saint Paul. Philosophers have struggled with this question for millenia. Many answers have been given but important properties that have been consistently identified is that the truth must be self-consistent and unambiguous.

We can imagine a brute-force approach to settling the question of what is true, once and for all. Let us write out every syntactically correct English sentence and then ask whether it is true or false. But among these sentences will be one that states the Liar Paradox and puts any propositional concept of truth to the test.

"This sentence is false." Well, is it? If we say it is false, then it is true. If we say it is not false, then it is false.

So, there are propositional sentences that are neither true nor false. It is fruitless to troubleshoot the issue by abolishing self-reference. We can construct mutually-referential sentences with precisely the same effect and it turns out that there is nothing bizarre, unnatural or meaningless about sentences that refer to other sentences, including themselves.

In the face of this problem, we might try a more modest approach. Perhaps the problem is that natural language is vague. Instead of enumerating English language sentences, we restrict ourselves to statements in a fully formalized mathematical language, and rather than asking whether a proposition is true or false, we restrict ourselves to asking whether it is provable or not. But it turns out that we can construct a version of the liar paradox in any suitably powerful formal language as well. Namely, we can construct a true proposition that says "This sentence is unprovable" and we are right back where we started. In every consistent formal system powerful enough to express arithmetic, there are well-formed, true propositions that cannot be proved. This is Godel's first incompleteness theorem.

This is problematic because the most interesting proposition we might like to prove is the proposition that states that our formal system is consistent. If we could prove that a formal system is consistent, then we could go to sleep and rest well, knowing that a contradiction will never arise in our mathematics. This can be done for first-order logic, for example. Here again, Godel disappoints us. A sufficiently powerful formal system is consistent precisely when it cannot prove itself consistent. This is his second incompleteness theorem. This means we can never rest in the full knowledge that our formal system has no hidden contradictions unless we have already proved it consistent in some other, more general formal system. But this more general formal system now has precisely the same problem we started with - it cannot prove itself consistent, provided that it is consistent.

The Limits of Proof

From Godel's work, we can easily show that it is not possible to build a generalized proof generator or any machine for discovering truth. Mathematician Rudy Rucker gives an approachable argument to show why this is the case in his book, Infinity and the Mind. Imagine that there exists a Truth Machine that, when given a book, will decide whether the book is true or false. Now, imagine a book that contains the blueprint and specification of the Truth Machine itself, with one claim tacked on: "such a machine will never say that this book is true." If the machine says the book is true, the book's claim is false and the machine has erred; if the machine never says so, the book is true but the machine cannot affirm it.

The Limits of Knowledge

The work of Godel, Turing and Chaitin has three important consequences for epistemology:

1) Mathematicians cannot design and build a self-obsoleting mechanical mathematician. The human mathematician is ineradicable.

2) We cannot construct a mathematical theory of everything.

3) We cannot construct a physical theory of everything.

Each point follows from the preceding. Because we cannot build a self-obsoleting mechanical mathematician (a theory that is more powerful than our own ability to comprehend), we cannot construct a mathematical theory of everything.

Because we cannot construct a mathematical theory of everything, because symbols are physical, because calculation can be defined as predicting the long-run state of a mechanical device, and because we can build a physical device to compute the digits of Omega whose long-run behavior cannot be predicted, we cannot construct a physical theory of everything.

Clayton -

^{1}This can be a bit confusing. Omega is computable "in the limit": a machine can emit ever-better lower bounds, but the time required for the first n digits to settle grows faster than any computable function. The idea is that the only way to see the solution to such a problem is to run it on a Turing machine and see what happens... there are no shortcuts, no way to save time, and you're never sure that you're done no matter how long you wait.

@Andris: Hmm, I'm not sure. I've taken a gander at Mises's chapter on probability and I have to say I think it's his least impressive work.

There are multiple ways to think about probability. The "naive, common sense" view of probability is, I think, what Mises is trying to outline... something along the lines that "this is what probability means as a category of human action." I'm not interested in that debate, so I won't remark on it.

In the Bayesian view, there are two kinds of probability, both equally meaningful and useful. The first is a priori probability. I believe this is close to the classical conception of probability - the analytical probability of something as determined on the basis of its geometry or its abstract configuration. The probability of rolling one side of a die or choosing a winning Powerball number are examples of this kind of probability. The second is the a posteriori probability which is I believe equivalent to the probability that frequentists accept - it is the "data-based probability"... the ratio of successes to attempts.

As far as so-called subjective probabilities, these are also meaningful but they don't mean what a lot of people probably think they do. For example, consider "the probability that Barack Obama will be elected President in 2012" prior to the election... something to which most people would venture a guess. Bettors hazard their money on it which suggests that the sentence must have a definite meaning.

Now, the a priori probability is untenable - no one can construct an analytical model of all the human brains that will be involved in the election and their true behavior on Nov. 6th. And neither is the a posteriori probability any help... the election is a one-off event. Yet people still bet on the election... how is this possible? The argument that Mises gives (rule out a priori, rule out a posteriori, conclude that there is no probability for the outcome of an election) neglects the information-theoretic view: bettors are not looking at the a posteriori probability of the election itself but of the causes of the election outcome, and they infer the probability of an election outcome from the conditional probability given the available data regarding those causes.

What does this mean? Well, the cause of an election outcome is people casting votes for that outcome. One of the correlates of people casting votes for an outcome is people talking about casting votes for an outcome or holding signs advertising that outcome, and so on. If you had asked, "What is the probability that Beauregard Finkelstein will be elected President in 2012", you could quite confidently have given the meaningful answer "roughly zero" on the basis of the fact that there were no people talking about casting votes for Finkelstein, that there were no people holding signs advertising Finkelstein, that Finkelstein's name was not present on any ballots, indeed, that there is no evidence that Finkelstein even exists.
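The inference pattern described here is just Bayes' rule applied to observable correlates of votes. A toy sketch with entirely invented numbers (the prior and the likelihoods are illustrative, not estimates of any real election):

```python
# Toy Bayesian update: P(candidate wins | many yard signs observed).
# All numbers below are invented for illustration.

p_win = 0.5            # prior probability of winning
p_signs_if_win = 0.8   # P(many yard signs | candidate wins)
p_signs_if_lose = 0.3  # P(many yard signs | candidate loses)

# Bayes' rule: P(win | signs) = P(signs | win) * P(win) / P(signs)
p_signs = p_signs_if_win * p_win + p_signs_if_lose * (1 - p_win)
p_win_given_signs = p_signs_if_win * p_win / p_signs
print(round(p_win_given_signs, 3))  # posterior, shifted above the 0.5 prior
```

Observing the correlate (signs) moves the probability away from the prior, which is exactly the "inference from causes" a bettor performs; with no observable correlates at all, as in the Finkelstein case, the posterior collapses toward zero.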

But this is a modern view of probability which had not yet emerged when Mises wrote. It turns out that information and probability are intimately linked and that we can even incorporate notions of causality into probability as long as we are very careful to make sure all our probability distributions map to the range 0 to 1.

What do you mean by the term "probability"? For example, how would you interpret the statement, "There is a 50% probability that this coin will come up heads"?

It's context-dependent. As I said, there are at least two valid kinds of probability (a priori and a posteriori) and, with care, we can construct from these a third, more subtle form, which is the Bayesian probability (which has links to information theory and even to causal-reasoning).

The definition of the a priori probability of the outcome of a coin-flip is as follows: there are two faces of a coin, one of which must come up if the coin is flipped, and neither side is favored by the coin's geometry; thus, the probability of a particular face coming up is 1/2 (one outcome out of two possible). Note that this is a definition, not a proof or argument or an attempt to model the physics of a real coin-flip.

The definition of the a posteriori probability of the outcome of a coin flip is as follows: Flip it many times and count the number of times one side comes up and take the ratio of this to the total number of flips.
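The two definitions can be put side by side in a simulation. With a fair pseudorandom coin, the a posteriori ratio converges toward the a priori value of 1/2 (the seed below is arbitrary, chosen only for reproducibility):

```python
import random

random.seed(42)  # arbitrary seed so the run is reproducible

def a_posteriori(flips: int) -> float:
    """A posteriori probability: ratio of heads to total flips."""
    heads = sum(random.random() < 0.5 for _ in range(flips))
    return heads / flips

a_priori = 1 / 2  # by definition: one face out of two possible outcomes
print(a_posteriori(100_000))  # should land near the a priori value 0.5
```

Note the direction of the comparison: the simulation does not prove the a priori definition; it shows that the two definitions agree when the geometric symmetry assumed by the a priori definition actually holds.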

I mean for the purpose of interpreting the term "Omega." A posteriori: probability is defined as the ratio of heads to total flips in a number of coin flips performed. A priori: where's the definition? But feel free to skip this; I'm just curious what definition of the term is used in defining the term Omega.

@AJ: OK, I'm not entirely sure what you're asking but I'll just explain how the probability Omega is defined.

Consider a Universal Turing Machine (UTM). It has an "input tape" which is just a series of 1's and 0's. We can imagine enumerating every possible input to the UTM in order of increasing length: the empty string, 0, 1, 00, 01, 10, 11, 000, 001, and so on.

Now, the hard part comes when you want to say, "Out of the set of all possible programs, choose one at random." Isn't the probability of choosing a particular element of an infinite set precisely zero? Or, if you assign probabilities some other way (such as counting the ones), you can end up with a probability space whose elements sum to greater than 1.0 or even to infinity!

This is the first problem that Chaitin solved. He said, what if you attach a prefix to each program such that the prefix specifies the length of the program? Then you can simply assign a "weight" or "probability" to each such program based on the total length of the prefixed program, and we know from the properties of a prefix-free code (more commonly referred to as a prefix code) - the Kraft inequality - that these weights sum to at most 1, so they fit into the probability space 0 to 1.

One such coding that is commonly used in algorithmic information theory is:

1^{L(x)}0x

Where L(x) means "length of x in bits". So, to encode the binary pattern 1101101:

For x = 1101101, L(1101101) = 7, so:

1^{7}01101101 = 111111101101101

The probability of each binary sequence x can then be defined as 2^{-(length of its codeword)}, i.e. 2^{-(2 L(x) + 1)} - the probability that fair coin flips spell out its codeword. Thus, the probability of 1101101, whose codeword is 15 bits long, is 2^{-15} = 1/32768. If you take the sum of these probabilities over all binary sequences, you will get 1.0, because there are 2^k sequences of length k and Sum (k=0 to inf) 2^k x 2^{-(2k+1)} = Sum (k=0 to inf) 2^{-(k+1)} = 1.0
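To make the encoding concrete, here's a small Python sketch (mine, just for illustration) of the 1^{L(x)}0x code, with a check that the implied probabilities obey the Kraft inequality:

```python
from itertools import product

def encode(x: str) -> str:
    """Prefix-free code: 1 repeated len(x) times, then 0, then x itself."""
    return "1" * len(x) + "0" + x

def decode(stream: str) -> str:
    """Read one codeword off the front of a bit stream."""
    n = stream.index("0")              # number of leading 1s = length of x
    return stream[n + 1 : n + 1 + n]

code = encode("1101101")               # L(x) = 7, so the codeword is 15 bits
assert code == "111111101101101"
assert decode(code) == "1101101"

# Kraft check: sum 2^-(codeword length) over all x with L(x) <= K.
# Each length-k codeword is 2k+1 bits, so the sum is 1 - 2^-(K+1).
K = 10
total = sum(
    2.0 ** -len(encode("".join(bits)))
    for k in range(K + 1)
    for bits in product("01", repeat=k)
)
assert abs(total - (1 - 2.0 ** -(K + 1))) < 1e-12
print(total)   # 0.99951171875, tending to 1.0 as K grows
```

Because no codeword is a prefix of another, a stream of coin flips picks out at most one program, which is what makes the weights behave like probabilities.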

We can also assign a physical meaning to this in terms of coin flips. Imagine a device:

Start: Counter=0; Flip a coin
1: If tails, go back to Start, else increment Counter and continue
2: Flip a coin; if heads, increment Counter and goto 2, else continue
3: Do the following Counter times: Flip a coin and report the result to the user

So, you can see that the probability of getting longer sequences is lower according to this configuration.
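Here's a quick simulation of that device (my own sketch) - you can watch the output-length distribution fall off geometrically:

```python
import random

def run_device(rng: random.Random) -> str:
    """Simulate the device: Start / step 1 / step 2 / step 3 above."""
    while True:                         # Start: Counter = 0; flip a coin
        counter = 0
        if rng.random() < 0.5:          # tails: go back to Start
            continue
        counter += 1                    # heads: increment and continue
        while rng.random() < 0.5:       # step 2: each heads increments Counter
            counter += 1
        # step 3: flip a coin Counter times and report the results
        return "".join(rng.choice("01") for _ in range(counter))

rng = random.Random(0)                  # fixed seed so the run is repeatable
lengths = [len(run_device(rng)) for _ in range(10_000)]
counts = {k: lengths.count(k) for k in (1, 2, 3)}
# P(output length = k) works out to 2^-k: expect roughly 5000, 2500, 1250
assert counts[1] > counts[2] > counts[3]
print(counts)
```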

Now that we have a way to generate any given input program to our UTM, it's just a matter of asking, "Given a randomly chosen input to the UTM, what is the probability that it will halt?" That is Omega.

Since Omega is defined as the probability it will halt, I'm simply wondering what definition of the word "probability" is being used within that definition (a priori, a posteriori, etc. - and if a priori what is the actual definition?). As Andris mentioned, Chaitin's conclusion seems to depend on probability being a non-subjective quantity, but Bayesians (at least Jaynes and Yudkowsky) say probability is subjective (which I assume means they define it in such a way that it is clearly a type of ignorance, etc.). So yeah, just a sentence as a definition of the term "probability" for the purpose of defining the word "Omega" is what I'm looking for.

@AJ: The ratio of halting programs to all programs - with the caveat that this is only meaningful if the set of all programs is structured as a prefix-code so some kind of measure can be placed on it. See the definition here. If every program halted, then the given sum would be 2^{-|p|} over every p, which equals 1.0. So this definition satisfies the condition that a probability must lie between 0 and 1. I will direct your attention to this article for perhaps a more readable presentation of the argument against a mathematical ToE by Chaitin himself. Here's another.
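To see the shape of that sum, here's a toy Python sketch. The real Omega needs a real UTM and an (impossible) halting oracle; I've substituted a trivially decidable stand-in predicate, so this shows only the bookkeeping, not the uncomputability:

```python
from itertools import product

def programs(max_len):
    """Enumerate the prefix-free programs 1^k 0 x for all x up to max_len bits."""
    for k in range(max_len + 1):
        for bits in product("01", repeat=k):
            yield "1" * k + "0" + "".join(bits)

def halts(program: str) -> bool:
    """Stand-in for the (undecidable!) halting predicate of a real UTM.
    Here we pretend every program halts, so the sum should tend to 1.0."""
    return True

# Omega = Sum of 2^-|p| over halting programs p; this is a partial sum.
omega_lower_bound = sum(2.0 ** -len(p) for p in programs(10) if halts(p))
assert abs(omega_lower_bound - (1 - 2.0 ** -11)) < 1e-12
print(omega_lower_bound)   # 0.99951171875 - approaching 1.0, as claimed
```

With the everything-halts stand-in, the partial sums approach exactly 1.0, matching the claim above that if every program halted the sum would be 1.0.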

Oh I see, thanks. What do you think about Wildberger's rejection of real numbers given Omega is supposed to be uncomputable? (Not to mention the infinities.)

Well, Omega is just a particular, incompressible real number. With probability 1, a randomly chosen real number is incompressible (maximally random), so Omega is not remarkable in that sense. What Omega helps explain is why real numbers in general don't make sense... all but countably many real numbers (that is to say, with probability 1.0) are incompressible like Omega. The square root of 2 and pi are exceptions - real numbers that are compressible. But there are only countably many such numbers. The rest are like Omega. So, what exactly do we need all these numbers for, when they cannot - even in principle - be named or specified or picked out of a set? Somewhere Chaitin says "I do not think I believe in real numbers."

If you want my opinion (and it is my opinion), two centuries from now, people will look back at the real number system the way we look back at Roman numerals - how could anyone even do math at all with such a horrible numbering system??? I believe the mischief comes from the fact that we have modeled our concepts of distance (measure) on the peculiar way we have constructed the decimal place-value system. The decimal place-value system was revolutionary and opened new vistas by comparison to the older systems that went before. But I'm of the opinion that we need to shift to thinking about math p-adically. This is not a question of the facts of math... those will remain the same either way. Rather, it is a question of how mathematicians are trained and the analytical structures within which human brains think about mathematical problems. Our thoughts are in large part the product of the language we use. This is as true of mathematics as of anything else.

Oh my god, is there no end to anti-Establishment, guerilla philosophy and mathematics?? I just discovered Tau... the new alternative to pi, as in, the pi that you learned in school, the ratio of diameter to circumference. The Elites have been lying to us for thousands of years!

This is kind of a late response but I noticed you guys were talking about two's complement.

I wanted to note that the two's complement representation of -1, for example the 8-bit quantity 11111111b, can be extended infinitely to the left so that it is arguably a mathematically acceptable representation of -1, and not just a convenient computer trick:

Let s = ...111111b = 1 + 2 + 4 + 8 + 16 + ...

Then

2s = 2 + 4 + 8 + 16 + ... = s - 1

After subtracting s from each side you can conclude that s = -1.
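The same identity shows up in ordinary fixed-width machine arithmetic - here's a quick Python check (my own sketch, masking Python's unbounded ints down to n bits):

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret an n-bit pattern as a two's complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value

for bits in (8, 16, 32):
    s = sum(1 << i for i in range(bits))    # 1 + 2 + 4 + ... cut off at n bits
    assert s == (1 << bits) - 1             # the all-ones pattern 11...1b
    assert to_signed(s, bits) == -1         # which two's complement reads as -1
    # 2s = s - 1 holds exactly, modulo 2^n (the carry falls off the left end)
    assert (2 * s) % (1 << bits) == (s - 1) % (1 << bits)
print("all widths check out")
```

The `% (1 << bits)` is the finite stand-in for "extended infinitely to the left": every carry that would propagate leftward forever just wraps away.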

where he computes 1-1+2-6+24-120+720-... (alternating factorials) and ends up with the Gompertz constant (0.596...). IIRC he computes the sum six different ways - like through differential equations, integration, continued fractions, series acceleration, or computing the logarithm of the sum - getting the same result each time.

Euler also computed a number of sums such as

1 + 2 + 3 + 4 + 5 +... = -1/12 and

1^3 + 2^3 + 3^3 + ... = 1/120

1^5 + 2^5 + 3^5 + ... = -1/252

which led to his conjecture of the functional equation (reflection formula) of the so-called Riemann zeta function
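For reference, the functional equation in its modern form, checked against the sums above:

zeta(s) = 2^s pi^{s-1} sin(pi s / 2) Gamma(1-s) zeta(1-s)

Plugging in s = -1: zeta(-1) = 2^{-1} pi^{-2} sin(-pi/2) Gamma(2) zeta(2) = (1/2)(1/pi^2)(-1)(1)(pi^2/6) = -1/12, matching the first sum; s = -3 and s = -5 give 1/120 and -1/252 the same way.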

There are also interesting products like

1 x 2 x 3 x 4 x... = sqrt(2pi) (If I remember correctly... I can derive it if you wish)
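The recollection is right, and the zeta-regularized derivation is short. Since zeta'(s) = -Sum (n=1 to inf) (log n) n^{-s}, we formally have log(1 x 2 x 3 x ...) = Sum (n=1 to inf) log n = -zeta'(0). The known value zeta'(0) = -(1/2) log(2 pi) then gives 1 x 2 x 3 x ... = e^{-zeta'(0)} = sqrt(2 pi).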

Simply admitting that negative numbers can in a sense be larger than infinity leads to techniques that are a lot more fun, and practically more useful, than the pedantic modern equivalents ("Borel summation").

The p-adic valuation leads also to the same conclusions, at least in respect to the summations. It is clear to me that p-adic numbers are in some sense more "fundamental" than the real numbers.

In fact, I see a connection to information theory in the p-adic numbers. The negative of the log (base p) of the p-adic absolute value of a number is its valuation - the number of trailing zero "digits" in its p-adic representation. This suggests that the geometry of the p-adic numbers is the geometry of causality*.
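Concretely, here's a little Python sketch (mine) of the p-adic valuation and absolute value:

```python
def vp(x: int, p: int) -> int:
    """p-adic valuation of a nonzero integer: how many times p divides x."""
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v

def abs_p(x: int, p: int) -> float:
    """p-adic absolute value: |x|_p = p^(-v_p(x))."""
    return float(p) ** -vp(x, p)

# 12 = 1100 in binary: two trailing zero digits, so v_2(12) = 2
assert vp(12, 2) == 2
assert abs_p(12, 2) == 0.25    # -log_2 of |12|_2 recovers the valuation
# 81 = 3^4, so 3-adically 81 is very "small"
assert vp(81, 3) == 4 and abs_p(81, 3) == 3.0 ** -4
print(vp(12, 2), abs_p(12, 2))
```

Note the inversion relative to the usual absolute value: highly divisible numbers are p-adically tiny.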

Further illumination comes from investigating the finite truncated p-adics, such as the 2-adics used in computers today (two's-complement binary). If you look at the 3-bit two's complement numbers:

000 = 0
001 = 1
010 = 2
011 = 3
100 = -4
101 = -3
110 = -2
111 = -1

Adding these numbers together forms a modulo "ring"**. You can envision a circle with each of the eight binary values arranged around it... and addition/subtraction (which is really just addition) is just the direction you are going around the circle. Extending to ever larger bit-widths, we can see that the radius of the circle is increasing, or the density of the numbers around the circle is increasing, or both, however you choose to think about it. In the limit then, it is a "circle with infinite radius", which, by the way, is a lot like this.
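The wraparound is easy to exhibit - a quick Python sketch (mine) of the 3-bit circle:

```python
BITS = 3
MOD = 1 << BITS                 # eight positions around the circle

def to_signed(v: int) -> int:
    """Read a 3-bit pattern as a two's complement value."""
    v %= MOD
    return v - MOD if v >= MOD // 2 else v

# Addition is movement around the circle: 3 + 1 wraps around to -4...
assert to_signed(3 + 1) == -4
# ...and subtraction really is addition: adding 6 is the same as subtracting 2
assert to_signed(5 + 6) == to_signed(5 - 2) == 3
print([to_signed(v) for v in range(MOD)])   # [0, 1, 2, 3, -4, -3, -2, -1]
```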

Euler's sums and products are definitely among the most remarkable facts in mathematics and I think they are telling us something very fundamental about the nature of numbers.

A project I work on in my spare time: Is there a way to encode complex numbers that is as natural as the p-adic encoding of negative and fractional numbers?

Clayton -

* Information theory is linked to causality through the idea of dependent-variables... if you make a copy of classical information, this is a lot like quantum "entanglement" in that the copied information and the original information are not independent variables and, thus, their information content is no longer simply additive.

**I'm using ring in the non-mathematical sense of the word... a circle

>Clayton: It is clear to me that p-adic numbers are in some sense more "fundamental" than the real numbers.

Hmmm... I've never really thought about it before, but real numbers do bother me since any given real number may require an infinite amount of information to define. In fact, this is true for all but an infinitesimal proportion of the real numbers.

Transcendental numbers that appear in practice can be defined by patterns like

I've never seen someone deal with a particular number that takes an infinite amount of information to represent.

Here's an example of one: Chaitin's constant. But as Chaitin himself constantly likes to point out, this number proves way too much about the real numbers, that is, it shows that the idea of "the real number set" simply doesn't make sense. No finite axiomatic system can possibly deal with such a set.

I disagree Clayton. Chaitin's constant only requires a finite amount of information to define. Otherwise a definition couldn't even be given. It is not like most real numbers which require an infinite amount of information to specify (e.g. an endless Cauchy sequence of arbitrary numbers).

@baxter: Think again. While Omega has a definite value, even its definition (algorithm to compute it) requires infinite information. It is irreducible. And this is, in fact, the case for all reals but an infinitesimal fraction of them.

And you might object that, while Omega might be really hard to compute, perhaps it's not so hard to check (a la NP problems). But this is not the case... checking a candidate computation of Omega is just as hard as computing Omega to begin with. So, while it is the case that the halting probability is the sum that Chaitin provides, that sum is merely formal - it could be applied to the probability of any language (Omega can be thought of as the weighted probability of the language HALTS) - so it is not really a definition.

>It is not like most real numbers which require an infinite amount of information to specify

This is precisely the "property" (or lack of one) that Omega was constructed to have! It is a very odd number in that we can say there is some one, definite Omega (relative to a given computer), but at the same time, it's in a kind of fuzzy superposition where the values of its bits are truly random, a "property" it shares with almost all other real numbers.

>While Omega has a definite value, even its definition (algorithm to compute it) requires infinite information. It is irreducible.

I still don't see things this way.

Omega is reducible: the infinite digits of omega can be conceptually compressed back into the very definition whence it came.

There is a trivial algorithm to approximate Omega that gets closer and closer the longer you run it. Each time a binary digit of Omega is determined, the set of possible values in which Omega is known to lie has its measure cut in half. Omega is only said to be "not computable" in the sense that the digits don't arrive in a certain, pleasant order.

I think my main problem is that I feel some commonplace words are being abused in formal mathematics to make things sound mystical.

>Omega is reducible: the infinite digits of omega can be conceptually compressed back into the very definition whence it came.
>
>There is a trivial algorithm to approximate Omega that gets closer and closer the longer you run it. Each time a binary digit of Omega is determined, the set of possible values in which Omega is known to lie has its measure cut in half. Omega is only said to be "not computable" in the sense that the digits don't arrive in a certain, pleasant order.
>
>I think my main problem is that I feel some commonplace words are being abused in formal mathematics to make things sound mystical.

Nope, the words are being used to mean exactly what they mean.

"X is uncomputable" means: no algorithm* computes X in any computable amount of time. Of course, it is trivial to write a definition like "search all possibilities and HALT when you've found the solution." In this sense, we could "compute" the Riemann Hypothesis or any unsolved mathematical problem. However, the heart of the issue is: how long would it take to search all possibilities and find the solution, and is there a faster (computable) way? If you can prove that any algorithm that would solve a problem (given unlimited time) must run longer than any computable function, you have proved that the problem is uncomputable.

The only "stronger" sense of uncomputable would be a contradiction (completely impossible).

Any finite algorithm that produces the bits of Omega must produce them more slowly than any computable function. Stated differently, it takes as long to solve the busy-beaver problem as it does to produce the bits of Omega.

Clayton -

*By definition, an algorithm must be of finite size