Solomonoff induction
Solomonoff induction is an idealized method of sequence prediction.
Variants
There are many variants of Solomonoff induction that seem to all turn out essentially the same, but most resources on Solomonoff induction only talk about one or two of the variants, which makes it pretty confusing for someone trying to learn about it. It's also not obvious that all of these variants are essentially the same, so showing that would be nice.
Comparison dimensions:
- Determinism: are your programs probabilistic or deterministic? If deterministic, I think the programs require random coinflips as input.
- Prediction length: finite vs infinite sequences
- Input length: finite vs infinite
- Solomonoff prior vs universal mixture
- Model selection vs sequence prediction?
- Discrete vs continuous? (is this the same as finite vs infinite?)
List of variants considered in various resources:
- Solomonoff's original paper:
- Eliezer's Solomonoff induction dialogue:[1]
- Li and Vitanyi:[2]
- (defined in theorem 4.3.1 and example 4.3.3) seems like a discrete universal mixture. Input is finite. The "programs" (lower semicomputable discrete semimeasures ) are probabilistic.
- LessWrong Wiki:
- Scholarpedia article:[3]
- : discrete, deterministic UTM (with random input), finite output, finite input
- Shane Legg's introduction:
- Sterkenburg's "The Foundations of Solomonoff Prediction":
- Marcus Hutter's papers...
- Herrmann's "An Introduction to Solomonoff Induction With an Application to Scientific Confirmation Theory":
Type of hypothesis
What is the type of a hypothesis? The type varies by formalization. For example:
- In classical statistical inference, a hypothesis is a constant that happens to be unknown, so e.g. it is a real number. This hypothesis then induces a probability distribution (called a "model").
- In Bayesian statistical inference, a hypothesis is an event (i.e. a subset of the sample space). There is a random variable representing the hypothesis (e.g. ), and you can get each individual hypothesis (e.g. ) by setting it to different values (e.g. ).
- In Solomonoff induction a hypothesis is a program (?) whose type depends on the variant of Solomonoff induction being used.
Significance of random coin flips
[5] (p. 4)
"""The prior AIXI is based on is the Solomonoff prior. It is defined as the output of a universal Turing machine (UTM) whose inputs are coin-flips.
In other words, feed a UTM a random program. Normally, you’d think of a UTM as only being able to simulate deterministic machines. Here, however, the initial inputs can instruct the UTM to use the rest of the infinite input tape as a source of randomness to simulate a stochastic Turing machine.
Combining this with the previous idea about viewing Bayesian learning as a way of allocating “trust” to “experts” which meets a bounded loss condition, we can see the Solomonoff prior as a kind of ideal machine learning algorithm which can learn to act like any algorithm you might come up with, no matter how clever."""[6]
Measures vs semimeasures
Turing machines, prefix machines, monotone machines
Kraft's inequality
Applications
- there are some standard applications you can find standard references
- carl's analogy to psychophysical laws
- malign prior stuff
References
- ↑ "Solomonoff induction: Intro Dialogue (Math 2)". Arbital.
- ↑ An Introduction to Kolmogorov Complexity and Its Applications (3rd ed).
- ↑ Marcus Hutter; Shane Legg; Paul M.B. Vitanyi. "Algorithmic probability". 2007.
- ↑ Marcus Hutter. "On Universal Prediction and Bayesian Confirmation".
- ↑ Ian Wood; Peter Sunehag; Marcus Hutter. "(Non-)Equivalence of Universal Priors".
- ↑ https://intelligence.org/2018/11/02/embedded-models/