### Inductive Inference and Solomonoff Induction

#### by jseldred

**About me:** My name is Jeffrey Eldred and (right now) I am a physics PhD student conducting accelerator-physics research at Fermilab. I’ve always had an interest in philosophical topics, and for the last five years I’ve written for the Q&A site AllExperts: Atheism.

**Presentation on Solomonoff Induction:** I presented on Inductive Inference and Solomonoff Induction for the fledgling Fermi Philosophical Society yesterday (April 10th 2014). A video of my presentation can be found here and the slides I’m using can be found here. The talk was well-received and sparked many interesting discussions that may be the subject of future talks.

**References:** I’ve put together a list of references to explore the mathematics and philosophy presented in the talk more deeply.

This is a description of Solomonoff Induction for a general audience prepared by the applied-philosophy website LessWrong. It was something of an inspiration for this talk and (by necessity) it matches the topics covered in my talk quite well. They also have a wiki which writes out the mathematical result of Solomonoff Induction. I would also recommend a talk by Eliezer Yudkowsky (founder of LessWrong) which explains how parsimony relates to precise analytical thinking and how humans are naturally biased toward the unparsimonious.

If you’d like to go straight to the source, you can read Solomonoff’s original papers, in which he presented Solomonoff Induction for the first time: Part I and Part II.

Bayes Theorem was an important topic covered in my talk. Bayes Theorem is useful but unintuitive. Here is the landmark psychological study showing a failure of Bayesian reasoning among physicians. Richard Carrier, a Biblical and statistical scholar, has been an outspoken advocate of atheism as well as of the application of Bayes Theorem to historical scholarship. This is a talk by Richard Carrier – the first half focuses on Bayes Theorem and how to use it; the second half uses a Bayesian argument against the existence of God (and in particular, against special pleading).
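As a small worked illustration of why such reasoning is unintuitive, here is a Bayes Theorem calculation in the style of the classic diagnostic-test problems posed to physicians. The prevalence, sensitivity, and false-positive figures below are illustrative assumptions, not numbers taken from the study itself:

```python
# Hypothetical numbers in the style of classic base-rate problems
# (all three rates below are illustrative assumptions).
prevalence = 0.01        # P(disease)
sensitivity = 0.80       # P(positive test | disease)
false_positive = 0.096   # P(positive test | no disease)

# Bayes' Theorem: P(disease | positive test)
numerator = sensitivity * prevalence
evidence = numerator + false_positive * (1 - prevalence)
posterior = numerator / evidence

print(f"P(disease | positive test) = {posterior:.1%}")  # about 7.8%
```

Most people (physicians included) intuitively answer something near the sensitivity, around 80%, rather than the correct posterior of under 8%: the low base rate dominates the calculation.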

If I had more time in my talk, I would have covered Edwin Jaynes’ contribution to probability in greater depth. It’s in one of my presentation’s backup slides. But I highly recommend reading Jaynes’ original *Information Theory and Statistical Mechanics*. It explores the relationship between the principle of indifference and identical microstates in what has come to be known as the Maximum Entropy Principle. This important work gives insight into both information theory and statistical mechanics.
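A minimal numerical sketch of the special case connecting the two principles: with no constraints beyond normalization, the maximum-entropy distribution is the uniform one, which is exactly the principle of indifference. The six-outcome setup is an arbitrary choice for illustration:

```python
import math
import random

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)

n = 6
uniform = [1.0 / n] * n  # the principle-of-indifference assignment

# No randomly perturbed distribution over the same outcomes can beat it.
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    q = [x / sum(w) for x in w]
    assert entropy(q) <= entropy(uniform) + 1e-12

print(f"max entropy = log({n}) = {entropy(uniform):.4f}")
```

Jaynes’ result goes further: adding constraints (e.g. a fixed mean energy) and maximizing entropy subject to them recovers the familiar distributions of statistical mechanics.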

The Principle of Indifference was also a key part of the talk. I think Wikipedia’s article on Bertrand’s Paradox (and how to resolve it) is an excellent case study of how the principle of indifference might be extended into continuous geometric space. I also have a backup slide about this.
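The paradox can be reproduced numerically. The Monte Carlo sketch below (method names and sample size are my own choices) implements the three classic ways of drawing a “random chord” of the unit circle, and shows that they give three different answers for the probability that the chord is longer than the side of the inscribed equilateral triangle:

```python
import math
import random

random.seed(42)
N = 200_000
side = math.sqrt(3)  # side of the equilateral triangle inscribed in the unit circle

def chord_random_endpoints():
    # Method 1: pick two uniformly random points on the circle.
    a, b = random.uniform(0, 2 * math.pi), random.uniform(0, 2 * math.pi)
    return 2 * math.sin(abs(a - b) / 2)

def chord_random_radius():
    # Method 2: pick the chord's midpoint uniformly along a radius.
    d = random.uniform(0, 1)
    return 2 * math.sqrt(1 - d * d)

def chord_random_midpoint():
    # Method 3: pick the chord's midpoint uniformly in the disk
    # (rejection sampling from the bounding square).
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return 2 * math.sqrt(1 - x * x - y * y)

results = {}
for name, draw, exact in [("endpoints", chord_random_endpoints, 1 / 3),
                          ("radius", chord_random_radius, 1 / 2),
                          ("midpoint", chord_random_midpoint, 1 / 4)]:
    results[name] = sum(draw() > side for _ in range(N)) / N
    print(f"{name}: {results[name]:.3f} (exact {exact:.3f})")
```

All three procedures are “uniform” over something, yet they disagree (1/3, 1/2, 1/4), which is exactly the difficulty of extending indifference to continuous spaces.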

Laplace was the one who really popularized both Bayes Theorem and the Principle of Indifference. You can read a free full translation of his *Philosophical Essay on Probabilities* here, but there is also a more modern translation.

Finally, I will leave you with a list of fulltext links to other philosophical works referenced in this talk:

David Hume’s *Dialogues Concerning Natural Religion*.

Introduction to Epistemological Pluralism.

Stephen Jay Gould’s *Nonoverlapping Magisteria*.

David Hume’s *Enquiry Concerning Human Understanding*.

Ludwig Wittgenstein’s *Philosophical Investigations*.

Douglas Hofstadter’s *Gödel, Escher, Bach: An Eternal Golden Braid*.

Many thanks, Jeff, for your educative, well-prepared, and inspirational talk, and for this extended list of materials as well! Some of my questions and remarks will come soon.

I have two remarks to your talk, Jeff.

The first one relates to your statement that “God is not required for Solomonoff induction”. At first glance, Solomonoff induction, as it is presented in your talk, does not require God. However, I do not see how you could justify the faith of fundamental science that the fundamental laws of nature are expressed by beautiful equations, as P. Dirac put it. Without God, what causes the aesthetics and the relatively simple, cognizable form of the fundamental laws? This question has two dimensions: one of belief and another of result.

My second remark is about values. How could you measure the truth and power of values? What makes certain values so powerful? How could you express the role of certain values in scientific cognition?

1) I would agree that this would be the way to insert God back into the foundations of Occam’s Razor. Perhaps I should have gone into uniformity of nature more in depth or removed that line from the conclusion.

In any event, I do not see uniformity of nature as a difficult thing to explain. Let’s call all things that change, matter. There must be something that describes the way these things change; let’s call those rules. Even if no rules can deterministically explain the way the matter changes, certainly it could be characterized by a probability distribution. There you have it: a universe with changing matter and changeless rules uniformly governing its behavior. And can we imagine anything different? Can we imagine a possible universe that is fundamentally unorganizable by principles of mathematics? I cannot imagine the counter-example, let alone find it more natural than a universe organizable with mathematics (like our own).

Neither am I certain how God helps explain the uniformity of nature, for that matter. It’s a popular idea, but if your series explained how this conclusion is justified, I must have missed it.

2) In order for induction to be useful, it does presume one cares about truth. If one hopes to accomplish anything, I think one cares about truth for practical reasons, if nothing else. So I think my presentation only assumes that one is not a true nihilist. True nihilism, if not a logical impossibility, is certainly a biological impossibility. So I don’t think it makes sense to present induction as in any way contingent upon any particular value system.

As to what particular value system to adopt, clearly this was beyond the scope of the talk. There are many secular moral philosophies that can be discussed. Perhaps I will make a presentation (some day) on the moral ideas that I favor and how I justify them.

1) “In any event, I do not see uniformity of nature as a difficult thing to explain.”

The problem is to explain the simplicity of the laws of nature, their validity within a huge range of parameters, as well as the extremely high accuracy of these simple forms. I am not sure that this is the same as what you are calling “uniformity of nature”. Without God, I do not see how to explain it. Moreover, I would rather think that all mathematical forms would have equal rights, and then, applying the “insufficient reason” principle, we would have to conclude that the laws of nature should be expected to be infinitely complex and incognizable.

2) Talking about the values, we have to distinguish between values for life and values for theoretical cognition. In my talk I stressed that the latter are very different from the former. I hope you remember my arguments and the multiple historical examples related to key scientific figures, so I will not repeat them here. What gives the ultimate rank and nobility to the “cosmic religious feeling” which Einstein called the true motivation for theoretical cognition?

1) Your proposal that a natural way for the universe to proceed is to give all mathematical forms equal weight is not sensible. In order to give mathematical forms equal weight, one would have to count them. At what point do two forms count as the same, and at what point do they count as different? Shall we divide mathematical forms into classes and subclasses, and give every class equal weight and every subclass equal weight within a class? This would all be nonsense. Mathematical forms follow from their axiomatic foundations, and so it would just be a question of how to weight the axioms.

But merely trade the language of mathematical forms and axioms for the equivalent language of algorithms and inputs and you see the answer has already been provided! Solomonoff Induction is how you give mathematical forms equal rights! And far from requiring an assumption about the naturalness of simplicity, it provides an explanation for the naturalness of simplicity. Far more insight, I might add, than is ever provided by merely attributing simplicity to God!
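A toy sketch of what this “equal rights” weighting does, under a deliberately simplified assumption of my own: instead of a universal Turing machine, the “programs” here are just bit patterns repeated forever. This is not Solomonoff Induction proper, only an illustration of how weighting each program by 2^-length makes simple continuations dominate the prediction:

```python
from itertools import product

def run(program, n):
    # Toy "machine": a program is a bit pattern repeated forever,
    # truncated to n output bits.
    return (program * (n // len(program) + 1))[:n]

def predictive(obs, max_len=10):
    # Weight each program 2^-length; collect the mass of programs whose
    # output is consistent with obs followed by "0" or by "1".
    mass = {"0": 0.0, "1": 0.0}
    for length in range(1, max_len + 1):
        for bits in product("01", repeat=length):
            p = "".join(bits)
            out = run(p, len(obs) + 1)
            if out[:len(obs)] == obs:
                mass[out[-1]] += 2.0 ** -length
    total = mass["0"] + mass["1"]
    return {b: m / total for b, m in mass.items()}

probs = predictive("0101")
print(probs)  # the short pattern "01" carries most of the weight, favoring "0"
```

Every program of every length gets its “equal rights” share at that length, yet the prediction is dominated by the shortest programs consistent with the data: the preference for simplicity falls out of the weighting rather than being assumed.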

Now, as far as the values of the universe, I suppose that you mean to indicate some form of the Fine-Tuning Paradox. This is the sort of problem that would be a great application of Occam’s Razor and Bayes Theorem, although I do not expect we’d agree on the details.

What is the prior probability of God? I think it is a common mistake to consider him a simple object. If he is indeed intelligent enough to hold the entire universe in his mind, then he would clearly be the most complex entity ever put forth as a hypothesis. His prior probability could be unfathomably small. I would regard his likelihood (P(B|A)) to be quite small as well. If he is a moralistic and powerful God, then the Evidential Problem of Evil makes a strong case for why we would not expect a universe like ours to follow from such a classical view of God. Even if we use whatever free-will theodicy believers might pick, and even if we grant God unseen excuses for the remainder of evil, our universe, so naturalistic and indifferent, does not resemble a typical universe one would expect from such a deity. Take, as one example, just the vast quantity of wasted space: the volume of the Earth is roughly 10^-60 of the volume of the observable universe, and a proton’s volume is a smaller fraction still of the Earth’s.

However difficult a secular scientific accounting of the universe might seem, every step towards a God-like hypothesis seems a step in the wrong direction. A step toward greater complexity and less plausibility. I think we are also far from knowing the limits of the explanatory power that can be found from a secular pursuit of scientific knowledge about the origin of the universe; in contrast, the explanatory power of a supernatural process seems to have peaked sometime prior to a modern understanding of science.

Hume’s Dialogues and Richard Carrier’s talk proceed in this fashion to demonstrate God to be out of place in a parsimonious view of the universe.

Jeff, I do not see how Solomonoff Induction (SI) may explain why the laws of nature are so short. SI, similar to Occam’s Razor, suggests trying a simpler form (algorithm) before trying more complicated ones, but it cannot tell you how complicated the true law you are going to discover is; it cannot even tell you that shorter laws are more probable to expect. Our laws of nature are very specific, they are short and elegant, and I do not see a reason to expect that sort of laws, unless they are purposefully chosen.

Also, I have to mention that the idea of the Absolute Mind, as a source of the laws of nature, is not an idea of something “complicated”; on the contrary, it is something simple, since the totality of mind cannot be structured: there are no such things as parts of mind. The brain is complicated, but the mind is not. That is why the Absolute Mind is the perfect terminus to the question about the source of the laws of nature.

As to the values, so far I would refrain from the further arguments, postponing them to the future, when I will finish my “Faith of Fundamental Science” course.

Indeed, SI tells you that shorter laws are more probable explanations for given observation data. It also tells you that the data not yet observed is more likely to be something that can be expressed as a shorter law (short relative to the other laws consistent with that data). In the regime where most or all of the data is not yet observed, it still recommends the shortest law as more probable. If no data is observed, it recommends the shortest possible law. So actually I worry more about the opposite problem: will we ever obtain laws simple enough to reasonably regard the result as natural? How would I go about explaining an apparent lack of simplicity in natural law?

I maintain minds are complicated. We’ve never observed a mind with no components, and the more powerful a mind is, the more complicated we observe it to be. There are many solid studies from psychology providing evidence for the modularity of the mind. I speak of components of the mind: of course the brain is modular, but the mind too can be considered an amalgamation of independent components. Take, for example, a case of brain damage that removes my ability to form new memories but otherwise does not affect my mind. Here is a mind with a slightly different set of properties; therefore we consider the missing feature of the mind a part. No part of this case is undermined by the fact that brain damage is the source of the example: I am speaking about features of the mind that can be added or subtracted independently. This definition of component is the one used, for example, by PCA/ICA algorithms: we can consider this feature of the mind in a manner distinct from other features, therefore it is a component.

I reiterate that we’ve never seen a componentless mind. We have also never seen a componentless algorithm, a componentless computer, a componentless neural network, a componentless cell, a componentless mathematical proof, a componentless image, a componentless thought, a componentless language, a componentless word, a componentless personality or a componentless character. We’ve also never seen a componentless memory, a componentless data-storage system, a componentless encoding, a componentless record, a componentless religion, a componentless story or a componentless history. So if a componentless mind is even possible, which has not been demonstrated, it is not an object fit to compare to minds. To use the word “mind” to describe it is to mislead ourselves into thinking we can even understand that it is properly classified as a mind. Similarly, every depiction of God doing anything, saying anything, thinking anything, judging anything, or having discernible properties is not a depiction of a componentless being. Therefore to call God “God”, and call to mind a tradition of ancient superstition and ritual, is equally misleading. And imagine we encode such a componentless Thing into binary: shall we call it 1 or 0?

Jeff, without any assumption about a source of the laws, it is impossible to make any statement about their expected length. I think this is an obvious statement, which has nothing to do with SI, and cannot be changed by SI. If I have no idea what their source is, I have no reason to expect the laws to be elegant or even discoverable. Their elegance was expected by the fathers of science on a very specific ground, namely, on the ground of the Pythagorean faith, shared by practically all of them, from Pythagoras to the main figures of the scientific revolution of the last century.

As to the simplicity, you are providing a list of entities which indeed are not simple: they all have parts, and on top of that all of them are something specific. We may ask about this or that specific algorithm, brain, etc. The absolute mind, or mind as such, is not anything particular; it cannot be this or that; it is a totality of thought, and thus it is thinkable as a terminus for the questioning about the source of the laws of nature. Compare, for instance, Wittgenstein’s:

“Jeff, without any assumption about a source of the laws, it is impossible to make any statement about their expected length. I think this is an obvious statement, which has nothing to do with SI, and cannot be changed by SI.”

I find that statements taken to be self-evident are often those which are the least justified, and we should not expect nature always to confirm our intuitions. SI is the result one gets when one assumes the least: it is the neutral, most conservative interpretation of the available data. Any other conclusion is what would require an assumption about the source of the laws to be a reasonable inference.

However, I see we are talking in circles on this one: I insist my claims have already been well supported and explained by the presentation; you insist there are obvious reasons it cannot be so. I don’t know that we can progress any further on this.

“The absolute mind, or mind as such, is not anything particular, it cannot be this or that; it is a totality of thought, thus it is thinkable as a terminus for the questioning about the source of the laws of nature.”

Let’s take the version of God that you have just described. Such a God has parts. Recall, these are logical parts not necessarily physical parts. “The totality of thought” can be deconstructed as an operation known as “Totality-of” applied to an object or set of objects “Thought”. Totality(Thought). That’s at least two parts. There are more parts, still, required to explain what is meant by “Totality-of” and what is meant by “Thought”.

You may ask where it ends. Can I call any concept exceedingly complicated by asking for an unending list of definitions? The answer is no. At some point I should have defined enough that (in principle) one can replace the words with unfamiliar symbols and still understand the mechanics. If one cannot do this, then it means that there is some critical part of the definition that I am withholding from the explicit statement of the theory. Hofstadter’s GEB is an excellent reference on this “typographical” approach to meaning. I think also of Wittgenstein’s meaning through demonstration.

I insist that such a God is the most complicated entity imaginable – you might do better to embrace this. A better candidate for the “terminus for the questioning about the source of the laws of nature” would be not an input, but the universal Turing machine used in Solomonoff Induction. The choice of universal Turing machine is left unfixed, but perhaps one could imagine God as some unique choice of Turing machine, or as serving a role equivalent to the multitude of universal Turing machines. This was actually the argument I was expecting. I would still insist that such a Thing is not properly referred to as God and may actually be untestable, but it would be a place for such a classical view in a modern framework.

Jeff, let’s model a scientific situation and imagine that you have some sort of data represented by a big array of apparently random digits. Whether they are actually random or not, you do not know. There are several important questions here, and I do not understand your position with respect to them.

Do we have any reason to believe that there is any law which describes these numbers?

If yes, do we have any additional reason to believe that we are able to find this law?

If yes, do we have a powerful enough value for the hard work for finding this law?

What is the relation of the Solomonoff Induction to these problems?

1) Yes. It would be impossible for there not to be a law that describes these numbers, and in fact we expect an infinite number of them. A classical philosophical argument might invoke Leibniz’s principle of sufficient reason, but it’s not necessary. A law can always be picked post hoc to describe the numbers, even if the law is just a list of the numbers.

2) Depends on what you mean by “this law”. If “this law” is any law that describes the numbers, then it’s trivial. If “this law” is instead the “true law”, the law that actually generated the numbers, then the question becomes more complicated.

How could we ever know if we had the “true law” out of the infinite set of candidate laws? The “true law” would be one that is not only consistent with these numbers, but would also survive a more constraining set of numbers. For example, we can use a future-based evaluation, where we would imagine generating more numbers by going forward in time, space, or some other parameter. Or we could use a subjunctive-based evaluation, where we would imagine changing some subset of the numbers and looking at the implications for the remainder of the numbers. But knowing the “true law” (actually a truer subset of laws) with 100% certainty by definition requires knowledge that we don’t have.
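The future-based evaluation can be sketched as follows. The “true law”, the candidate laws, and the data below are all made up for illustration: a compact law and a post-hoc lookup table both fit the observed prefix, but only the compact one survives newly generated data:

```python
# Observed data: the first 8 values of a hypothetical measurement,
# secretly generated by the "true law" n^2 mod 7.
observed = [n * n % 7 for n in range(8)]

# Two candidate laws fitted post hoc to the observed prefix.
compact_law = lambda n: n * n % 7       # happens to be the true law
lookup_law = dict(enumerate(observed))  # just a list of the numbers

# Both are consistent with everything seen so far.
assert all(compact_law(n) == observed[n] for n in range(8))
assert all(lookup_law[n] == observed[n] for n in range(8))

# "Future-based evaluation": generate more data and re-test each candidate.
future = [n * n % 7 for n in range(8, 16)]
survives_compact = all(compact_law(n) == future[n - 8] for n in range(8, 16))
survives_lookup = all(lookup_law.get(n) == future[n - 8] for n in range(8, 16))
print(survives_compact, survives_lookup)  # True False
```

The lookup table is a perfectly valid law for the observed numbers, but it carries no commitment about unobserved ones, so the more constraining data set eliminates it.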

So when discussing our knowledge of the “true law” (or laws), we must resort to statistical argument. But just because the claims we can make about the “true laws” are statistical does not mean all such claims are equally valid. In the parlance of continuous statistics, we can consider such criteria as the best estimator (the minimum-variance unbiased estimator) or the maximal-entropy partition function. What it means for a statistical claim to be “right” is not that it always makes the right prediction, but that it has the uniquely best chance of making the right prediction. If “finding” the true law means guessing, doing so would be unwise from a statistical perspective. Instead, our criterion should be to find the unique most accurate assessment of the probabilities that various candidates are in fact the true law.

So to answer your question about finding the “true law”: we cannot have certain knowledge of the “true law”. There can be certain knowledge of the uniquely most accurate assessment of the probabilities. This is what Solomonoff Induction is. The order of descending probability always coincides with the order of ascending complexity. This is what Occam’s Razor is. But Solomonoff Induction is needed to establish the probabilities.

3) I don’t know what you mean by a “powerful enough value for the hard work”. I believe it can be shown that for a finite number of bits and a finite level of precision, the most accurate assessment of the probabilities (within precision) can be found in a finite amount of time (or computational steps). This is the algorithmization of Solomonoff Induction. Because this part is outside of my expertise and is an active area of research, I cannot personally validate the claim “finite bits, finite precision, finite time”. I think it is fair to say this is the belief of some experts in the field, that they have reasons for thinking it, and that there are no known counter-examples.

Jeff, I think we are inside a problem of inconsistent languages; I should stay closer to your talk. I am trying to do that now.

On page 17 of your slides (“Solomonoff Induction”) you present a formula with 2^-L(s), with the comment “For random input, the likelihood of input s decreases exponentially with its length L”. I wish to discuss this formula, to see all its assumptions. To clarify, could you please tell:

1. What does “the likelihood” mean here? Does it mean you will continue receiving your data and checking your hypothesis against some more data?

2. Does this formula assume having at least two inputs compatible with the given output?

3. What does it mean for the input to be “random” in this context? What is the related ensemble, and what is the procedure for picking the specific input from it?

So far I’d rather stop with these questions.
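[Editorial note: one concrete reading of the 2^-L(s) formula in question 1, offered here only as an illustration and not as Jeff’s answer: if the input bits are generated by independent fair coin flips, then any particular string s of length L occurs with probability exactly 2^-L(s), which a simulation can confirm.]

```python
import random

random.seed(1)
s = "1101"          # one particular input string (arbitrary choice)
L = len(s)
trials = 200_000

# Generate random L-bit strings by fair coin flips and count matches with s.
hits = sum(
    "".join(random.choice("01") for _ in range(L)) == s
    for _ in range(trials)
)
freq = hits / trials
print(f"observed frequency {freq:.4f} vs 2^-{L} = {2 ** -L:.4f}")
```

Under this reading, the exponential decrease with L is just the statement that each additional random bit halves the probability of any specific string.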