Variants of Solomonoff induction: Difference between revisions

From Machinelearning
No edit summary
No edit summary
Line 17: Line 17:
| Sterkenburg (p. 22)<ref name="sterkenburg">Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.</ref> || <math>P_{\mathrm{I}}(\sigma) = \lim_{n\to\infty} \frac{|T_{\sigma,n}|}{|T_n|}</math> where <math>\sigma</math> is a finite string, <math>T_n</math> is the set of all halting (valid) inputs of length <math>n</math> to the reference machine <math>U</math>, <math>T_{\sigma,n}</math> is the set of all halting (valid) inputs of length <math>n</math> that output something starting with <math>\sigma</math> || deterministic || universal Turing machine (no restrictions on prefix-free-ness) || discrete?
| Sterkenburg (p. 22)<ref name="sterkenburg">Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.</ref> || <math>P_{\mathrm{I}}(\sigma) = \lim_{n\to\infty} \frac{|T_{\sigma,n}|}{|T_n|}</math> where <math>\sigma</math> is a finite string, <math>T_n</math> is the set of all halting (valid) inputs of length <math>n</math> to the reference machine <math>U</math>, <math>T_{\sigma,n}</math> is the set of all halting (valid) inputs of length <math>n</math> that output something starting with <math>\sigma</math> || deterministic || universal Turing machine (no restrictions on prefix-free-ness) || discrete?
|-
|-
| Sterkenburg (p. 24)<ref name="sterkenburg"/> || <math>P'_{\mathrm{II}}(\sigma) = 2^{-|\tau_\mathrm{min}|}</math> where <math>\tau_\mathrm{min}</math> is the shortest program <math>\tau</math> such that <math>U(\tau) = \sigma</math> (i.e. the shortest program that causes the reference machine to output <math>\sigma</math> and halt) || deterministic || universal Turing machine || discrete? || this formula does not define a probability distribution over strings <math>\sigma</math> because the sum of probabilities does not converge
| Sterkenburg (p. 24)<ref name="sterkenburg"/> || <math>P'_{\mathrm{II}}(\sigma) = 2^{-|\tau_\mathrm{min}|}</math> where <math>\tau_\mathrm{min}</math> is the shortest program <math>\tau</math> such that <math>U(\tau) = \sigma</math> (i.e. the shortest program that causes the reference machine to output <math>\sigma</math> and halt) || deterministic || universal Turing machine, universal prefix machine (to get a probability distribution) || discrete? || this formula does not define a probability distribution over strings <math>\sigma</math> because the sum of probabilities does not converge
|-
|-
| Sterkenburg (p. 25)<ref name="sterkenburg"/> || <math>P''_{\mathrm{II}}(\sigma) = \lim_{n\to\infty} \sum_{\tau \in T_{\sigma,n}} 2^{-|\tau|}</math> where <math>T_{\sigma,n}</math> is the set of all programs <math>\tau</math> of length <math>n</math> such that <math>U(\tau)</math> begins with <math>\sigma</math> || deterministic || universal Turing machine || || <math>P''_{\mathrm{II}}(\sigma)</math> is divergent even for a single <math>\sigma</math>, so this is not actually a workable version, but is intended as a stepping stone
| Sterkenburg (p. 25)<ref name="sterkenburg"/> || <math>P''_{\mathrm{II}}(\sigma) = \lim_{n\to\infty} \sum_{\tau \in T_{\sigma,n}} 2^{-|\tau|}</math> where <math>T_{\sigma,n}</math> is the set of all programs <math>\tau</math> of length <math>n</math> such that <math>U(\tau)</math> begins with <math>\sigma</math> || deterministic || universal Turing machine || || <math>P''_{\mathrm{II}}(\sigma)</math> is divergent even for a single <math>\sigma</math>, so this is not actually a workable version, but is intended as a stepping stone
|-
|-
| Sterkenburg (p. 26)<ref name="sterkenburg"/> || <math>P_{\mathrm{II}}(\sigma) = \lim_{\epsilon\to 0} \lim_{n\to \infty} \sum_{\tau \in T_{\sigma,n}} \left(\frac{1-\epsilon}{2}\right)^{|\tau|}</math> || deterministic || universal Turing machine || || The use of the <math>\epsilon</math> is a hack to get the sum to converge
| Sterkenburg (p. 26)<ref name="sterkenburg"/> || <math>P_{\mathrm{II}}(\sigma) = \lim_{\epsilon\to 0} \lim_{n\to \infty} \sum_{\tau \in T_{\sigma,n}} \left(\frac{1-\epsilon}{2}\right)^{|\tau|}</math> || deterministic || universal Turing machine, universal prefix machine (to get a probability distribution) || || The use of the <math>\epsilon</math> is a hack to get the sum to converge
|}
|}



Revision as of 03:29, 31 March 2019

This page lists some variants of Solomonoff induction.

For determinism, I think "deterministic" is the same as "Solomonoff prior" and "stochastic" is the same as "universal mixture".

For discrete vs continuous, I think this just means whether the prior we define is over finite strings or over infinite sequences (where we want to know the probability of an infinite sequence starting with a given finite string).

Source Formula Determinism Type of machine used Discrete vs continuous Notes
LessWrong Wiki[1] m(y0)=pP:U(p)=y02(p) where P is the set of self-delimiting programs Deterministic Page doesn't say, but uses self-delimiting programs and it's discrete, so prefix Turing machine? Discrete because of the U(p)=y0 rather than U(p)=y0*
Scholarpedia discrete universal a priori probability[2] m(x)=p:U(p)=x2(p) where the sum is over halting programs deterministic? prefix Turing machine discrete
Scholarpedia continuous universal a priori probability[2] M(x)=p:U(p)=x*2(p) where the sum is over minimal programs deterministic? Monotone Turing machine Continuous
Sterkenburg (p. 22)[3] PI(σ)=limn|Tσ,n||Tn| where σ is a finite string, Tn is the set of all halting (valid) inputs of length n to the reference machine U, Tσ,n is the set of all halting (valid) inputs of length n that output something starting with σ deterministic universal Turing machine (no restrictions on prefix-free-ness) discrete?
Sterkenburg (p. 24)[3] P'II(σ)=2|τmin| where τmin is the shortest program τ such that U(τ)=σ (i.e. the shortest program that causes the reference machine to output σ and halt) deterministic universal Turing machine, universal prefix machine (to get a probability distribution) discrete? this formula does not define a probability distribution over strings σ because the sum of probabilities does not converge
Sterkenburg (p. 25)[3] P'II(σ)=limnτTσ,n2|τ| where Tσ,n is the set of all programs τ of length n such that U(τ) begins with σ deterministic universal Turing machine P'II(σ) is divergent even for a single σ, so this is not actually a workable version, but is intended as a stepping stone
Sterkenburg (p. 26)[3] PII(σ)=limϵ0limnτTσ,n(1ϵ2)|τ| deterministic universal Turing machine, universal prefix machine (to get a probability distribution) The use of the ϵ is a hack to get the sum to converge

References

  1. https://wiki.lesswrong.com/wiki/Solomonoff_induction
  2. 2.0 2.1 Marcus Hutter; Shane Legg; Paul M.B. Vitanyi. "Algorithmic probability". Scholarpedia. 2007.
  3. 3.0 3.1 3.2 3.3 Tom Florian Sterkenburg. "The Foundations of Solomonoff Prediction". February 2013.