Variants of Solomonoff induction

This page lists some variants of Solomonoff induction.

For the determinism column, I think "deterministic" is the same as "Solomonoff prior" and "stochastic" is the same as "universal mixture".
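
In symbols, and as my own gloss rather than a definition taken from any of the sources below (the notation <math>\xi</math>, <math>\mathcal{M}</math>, <math>w_\nu</math> is mine, and the exact class of semimeasures and choice of weights vary by author): the deterministic flavour sums over programs for a deterministic reference machine, while the stochastic flavour mixes over a class of semimeasures,

:<math>M(\sigma) = \sum_{p \,:\, U(p) = \sigma*} 2^{-|p|} \qquad\text{versus}\qquad \xi(\sigma) = \sum_{\nu \in \mathcal{M}} w_\nu\, \nu(\sigma).</math>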

For the discrete vs continuous column, I think this just means whether the prior is defined over finite strings or over infinite sequences (where we ask for the probability that an infinite sequence starts with a given finite string).
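
As a rough restatement of the distinction (again my own summary, with <math>\Lambda</math> denoting the empty string): a discrete prior <math>m</math> puts weight on individual finite strings and only needs its total mass to be at most 1, whereas a continuous prior <math>M(\sigma)</math> is read as the mass of all infinite sequences starting with <math>\sigma</math>, so it must not grow when <math>\sigma</math> is extended:

:<math>\sum_{x} m(x) \le 1, \qquad\qquad M(\Lambda) \le 1, \quad M(\sigma) \ge M(\sigma 0) + M(\sigma 1).</math>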

{| class="wikitable"
! Source !! Formula !! Determinism !! Type of machine used !! Discrete vs continuous !! Notes
|-
| LessWrong Wiki[1] || <math>m(y_0) = \sum_{p \in P \,:\, U(p) = y_0} 2^{-\ell(p)}</math> where <math>P</math> is the set of self-delimiting programs || Deterministic || Page doesn't say, but it uses self-delimiting programs and it's discrete, so prefix Turing machine? || Discrete, because of the <math>U(p) = y_0</math> rather than <math>U(p) = y_0*</math> ||
|-
| Scholarpedia discrete universal a priori probability[2] || <math>m(x) = \sum_{p \,:\, U(p) = x} 2^{-\ell(p)}</math> where the sum is over halting programs || Deterministic? || Prefix Turing machine || Discrete ||
|-
| Scholarpedia continuous universal a priori probability[2] || <math>M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}</math> where the sum is over minimal programs || Deterministic? || Monotone Turing machine || Continuous ||
|-
| Sterkenburg (p. 22)[3] || <math>P_{\mathrm{I}}(\sigma) = \lim_{n\to\infty} \frac{|T_{\sigma,n}|}{|T_n|}</math> where <math>\sigma</math> is a finite string, <math>T_n</math> is the set of all halting (valid) inputs of length <math>n</math> to the reference machine <math>U</math>, and <math>T_{\sigma,n}</math> is the set of all halting (valid) inputs of length <math>n</math> that output something starting with <math>\sigma</math> || Deterministic || Universal Turing machine (no restriction on prefix-freeness) || Discrete? ||
|-
| Sterkenburg (p. 24)[3] || <math>P'_{\mathrm{II}}(\sigma) = 2^{-|\tau_{\min}|}</math> where <math>\tau_{\min}</math> is the shortest program <math>\tau</math> such that <math>U(\tau) = \sigma</math> (i.e. the shortest program that causes the reference machine to output <math>\sigma</math> and halt) || Deterministic || Universal Turing machine, universal prefix machine (to get a probability distribution; see remark on p. 27) || Discrete? || This formula does not define a probability distribution over strings <math>\sigma</math> because the sum of probabilities does not converge
|-
| Sterkenburg (p. 25)[3] || <math>P'_{\mathrm{II}}(\sigma) = \lim_{n\to\infty} \sum_{\tau \in T_{\sigma,n}} 2^{-|\tau|}</math> where <math>T_{\sigma,n}</math> is the set of all programs <math>\tau</math> of length <math>n</math> such that <math>U(\tau)</math> begins with <math>\sigma</math> || Deterministic || Universal Turing machine || || <math>P'_{\mathrm{II}}(\sigma)</math> is divergent even for a single <math>\sigma</math>, so this is not actually a workable version, but is intended as a stepping stone
|-
| Sterkenburg (p. 26)[3] || <math>P_{\mathrm{II}}(\sigma) = \lim_{\epsilon\to 0} \lim_{n\to \infty} \sum_{\tau \in T_{\sigma,n}} \left(\frac{1-\epsilon}{2}\right)^{|\tau|}</math> || Deterministic || Universal Turing machine, universal prefix machine (to get a probability distribution; see remark on p. 27) || || The use of the <math>\epsilon</math> is a hack to get the sum to converge
|-
| Sterkenburg (p. 29)[3] || <math>Q_U(\sigma) = \sum_{\tau \in T_\sigma} 2^{-|\tau|}</math> where <math>T_\sigma</math> is the set of minimal descriptions of <math>\sigma</math> (i.e. the set of programs that output something starting with <math>\sigma</math> such that, if one removes one bit from the end of the program, it no longer outputs something starting with <math>\sigma</math>) || Deterministic || Universal monotone machine || Continuous? ||
|}
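
To make the discrete-style formulas in the table concrete, here is a minimal numerical sketch in Python of the Scholarpedia discrete quantity m(x), the sum of 2^(−ℓ(p)) over halting programs p with U(p) = x. The table TOY_PREFIX_MACHINE is an invented finite stand-in for a prefix machine, chosen only so the sum is finite and easy to check by hand; with a genuine universal prefix machine, m is only lower semicomputable, so this illustrates the summation rather than implementing the prior.

<syntaxhighlight lang="python">
from fractions import Fraction

# Toy, invented prefix-free "machine": a finite table mapping program
# strings over {0,1} to the exact output they print before halting.
# No program here is a proper prefix of another, mimicking a prefix machine.
TOY_PREFIX_MACHINE = {
    "00":   "0",
    "01":   "1",
    "100":  "01",
    "101":  "0",
    "110":  "10",
    "1110": "01",
    "1111": "11",
}

def toy_discrete_prior(x, machine=TOY_PREFIX_MACHINE):
    """Sum 2^(-len(p)) over all programs p in the table with U(p) == x."""
    return sum(
        Fraction(1, 2 ** len(p))
        for p, output in machine.items()
        if output == x
    )

if __name__ == "__main__":
    for x in ["0", "01", "11"]:
        print(x, toy_discrete_prior(x))  # e.g. "01" -> 3/16 (from "100" and "1110")
</syntaxhighlight>

With this toy table, for example, m("0") = 1/4 + 1/8 = 3/8, coming from the two programs "00" and "101".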
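
In the same spirit, here is a sketch of the monotone-machine quantity Q_U(σ) from Sterkenburg (p. 29) (essentially the Scholarpedia continuous M), again over an invented finite table TOY_MONOTONE_MACHINE standing in for a monotone machine. Following the definition in the table, a program τ counts as a minimal description of σ when its output starts with σ but the output of τ with its last bit removed does not.

<syntaxhighlight lang="python">
from fractions import Fraction

# Toy, invented "monotone machine": a finite table mapping each program
# string to the output it has written so far on that input.  Extending a
# program only ever extends its output, as on a real monotone machine.
TOY_MONOTONE_MACHINE = {
    "":    "",
    "0":   "0",
    "1":   "",
    "00":  "00",
    "01":  "01",
    "10":  "0",
    "11":  "1",
    "100": "01",
    "101": "00",
    "110": "10",
    "111": "11",
}

def is_minimal_description(tau, sigma, machine):
    """True if U(tau) starts with sigma but U(tau minus its last bit) does not."""
    if not machine[tau].startswith(sigma):
        return False
    return tau == "" or not machine[tau[:-1]].startswith(sigma)

def toy_monotone_prior(sigma, machine=TOY_MONOTONE_MACHINE):
    """Sum 2^(-|tau|) over the minimal descriptions tau of sigma in the table."""
    return sum(
        Fraction(1, 2 ** len(tau))
        for tau in machine
        if is_minimal_description(tau, sigma, machine)
    )

if __name__ == "__main__":
    for sigma in ["0", "00", "01", "1"]:
        print(sigma, toy_monotone_prior(sigma))  # e.g. "0" -> 3/4 (from "0" and "10")
</syntaxhighlight>

With this toy table the result behaves like a continuous semimeasure: Q("0") = 3/4 while Q("00") = Q("01") = 3/8, so Q("0") = Q("00") + Q("01").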

References

  1. LessWrong Wiki. "Solomonoff induction". https://wiki.lesswrong.com/wiki/Solomonoff_induction
  2. Marcus Hutter; Shane Legg; Paul M. B. Vitanyi. "Algorithmic probability". Scholarpedia. 2007.
  3. Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.