Variants of Solomonoff induction: Difference between revisions

Revision as of 03:29, 31 March 2019

This page lists some variants of Solomonoff induction.

For determinism, I think "deterministic" is the same as "Solomonoff prior" and "stochastic" is the same as "universal mixture".

For discrete vs continuous, I think this just means whether the prior we define is over finite strings or over infinite sequences (where we want to know the probability of an infinite sequence starting with a given finite string).

Source	Formula	Determinism	Type of machine used	Discrete vs continuous	Notes
LessWrong Wiki^[1]	$m (y_{0}) = \sum_{p \in P : U (p) = y_{0}} 2^{- ℓ (p)}$ where $P$ is the set of self-delimiting programs	Deterministic	Page doesn't say, but uses self-delimiting programs and it's discrete, so prefix Turing machine?	Discrete because of the $U (p) = y_{0}$ rather than $U (p) = y_{0} *$
Scholarpedia discrete universal a priori probability^[2]	$m (x) = \sum_{p : U (p) = x} 2^{- ℓ (p)}$ where the sum is over halting programs	deterministic?	prefix Turing machine	discrete
Scholarpedia continuous universal a priori probability^[2]	$M (x) = \sum_{p : U (p) = x *} 2^{- ℓ (p)}$ where the sum is over minimal programs	deterministic?	Monotone Turing machine	Continuous
Sterkenburg (p. 22)^[3]	$P_{I} (σ) = lim_{n \to \infty} \frac{\| T_{σ, n} \|}{\| T_{n} \|}$ where $σ$ is a finite string, $T_{n}$ is the set of all halting (valid) inputs of length $n$ to the reference machine $U$ , $T_{σ, n}$ is the set of all halting (valid) inputs of length $n$ that output something starting with $σ$	deterministic	universal Turing machine (no restrictions on prefix-free-ness)	discrete?
Sterkenburg (p. 24)^[3]	$P'_{I I} (σ) = 2^{- \| τ_{m i n} \|}$ where $τ_{m i n}$ is the shortest program $τ$ such that $U (τ) = σ$ (i.e. the shortest program that causes the reference machine to output $σ$ and halt)	deterministic	universal Turing machine, universal prefix machine (to get a probability distribution)	discrete?	this formula does not define a probability distribution over strings $σ$ because the sum of probabilities does not converge
Sterkenburg (p. 25)^[3]	$P^{'}'_{I I} (σ) = lim_{n \to \infty} \sum_{τ \in T_{σ, n}} 2^{- \| τ \|}$ where $T_{σ, n}$ is the set of all programs $τ$ of length $n$ such that $U (τ)$ begins with $σ$	deterministic	universal Turing machine		$P^{'}'_{I I} (σ)$ is divergent even for a single $σ$ , so this is not actually a workable version, but is intended as a stepping stone
Sterkenburg (p. 26)^[3]	$P_{I I} (σ) = lim_{ϵ \to 0} lim_{n \to \infty} \sum_{τ \in T_{σ, n}} {(\frac{1 - ϵ}{2})}^{\| τ \|}$	deterministic	universal Turing machine, universal prefix machine (to get a probability distribution)		The use of the $ϵ$ is a hack to get the sum to converge

References

↑ https://wiki.lesswrong.com/wiki/Solomonoff_induction
↑ ^2.0 ^2.1 Marcus Hutter; Shane Legg; Paul M.B. Vitanyi. "Algorithmic probability". Scholarpedia. 2007.
↑ ^3.0 ^3.1 ^3.2 ^3.3 Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.

[1] ttps://wiki.lesswrong.com/wiki/Solomonoff_induction

[scholarpedia-2] 2.0 ^2.1 Marcus Hutter; Shane Legg; Paul M.B. Vitanyi. "Algorithmic probability". Scholarpedia. 2007.

[sterkenburg-3] 3.0 ^3.1 ^3.2 ^3.3 Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.

[1]

[2]

[3]

@@ Line 17: / Line 17: @@
 | Sterkenburg (p. 22)<ref name="sterkenburg">Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.</ref> || <math>P_{\mathrm{I}}(\sigma) = \lim_{n\to\infty} \frac{|T_{\sigma,n}|}{|T_n|}</math> where <math>\sigma</math> is a finite string, <math>T_n</math> is the set of all halting (valid) inputs of length <math>n</math> to the reference machine <math>U</math>, <math>T_{\sigma,n}</math> is the set of all halting (valid) inputs of length <math>n</math> that output something starting with <math>\sigma</math> || deterministic || universal Turing machine (no restrictions on prefix-free-ness) || discrete?
 |-
-| Sterkenburg (p. 24)<ref name="sterkenburg"/> || <math>P'_{\mathrm{II}}(\sigma) = 2^{-|\tau_\mathrm{min}|}</math> where <math>\tau_\mathrm{min}</math> is the shortest program <math>\tau</math> such that <math>U(\tau) = \sigma</math> (i.e. the shortest program that causes the reference machine to output <math>\sigma</math> and halt) || deterministic || universal Turing machine || discrete? || this formula does not define a probability distribution over strings <math>\sigma</math> because the sum of probabilities does not converge
+| Sterkenburg (p. 24)<ref name="sterkenburg"/> || <math>P'_{\mathrm{II}}(\sigma) = 2^{-|\tau_\mathrm{min}|}</math> where <math>\tau_\mathrm{min}</math> is the shortest program <math>\tau</math> such that <math>U(\tau) = \sigma</math> (i.e. the shortest program that causes the reference machine to output <math>\sigma</math> and halt) || deterministic || universal Turing machine, universal prefix machine (to get a probability distribution) || discrete? || this formula does not define a probability distribution over strings <math>\sigma</math> because the sum of probabilities does not converge
 |-
 | Sterkenburg (p. 25)<ref name="sterkenburg"/> || <math>P''_{\mathrm{II}}(\sigma) = \lim_{n\to\infty} \sum_{\tau \in T_{\sigma,n}} 2^{-|\tau|}</math> where <math>T_{\sigma,n}</math> is the set of all programs <math>\tau</math> of length <math>n</math> such that <math>U(\tau)</math> begins with <math>\sigma</math> || deterministic || universal Turing machine || || <math>P''_{\mathrm{II}}(\sigma)</math> is divergent even for a single <math>\sigma</math>, so this is not actually a workable version, but is intended as a stepping stone
 |-
-| Sterkenburg (p. 26)<ref name="sterkenburg"/> || <math>P_{\mathrm{II}}(\sigma) = \lim_{\epsilon\to 0} \lim_{n\to \infty} \sum_{\tau \in T_{\sigma,n}} \left(\frac{1-\epsilon}{2}\right)^{|\tau|}</math> || deterministic || universal Turing machine || || The use of the <math>\epsilon</math> is a hack to get the sum to converge
+| Sterkenburg (p. 26)<ref name="sterkenburg"/> || <math>P_{\mathrm{II}}(\sigma) = \lim_{\epsilon\to 0} \lim_{n\to \infty} \sum_{\tau \in T_{\sigma,n}} \left(\frac{1-\epsilon}{2}\right)^{|\tau|}</math> || deterministic || universal Turing machine, universal prefix machine (to get a probability distribution) || || The use of the <math>\epsilon</math> is a hack to get the sum to converge
 |}