User:IssaRice/Metropolis–Hastings algorithm

without exception, every single explanation i have seen so far of this absolutely sucks. not just "most really suck, and some suck a little": literally every one of them sucks really badly. this might be my best candidate for the most horribly-explained thing ever.

in my opinion, the things a good explanation must cover are:

  • what the heck is sampling, even? once we have a fair coin, use it to generate samples from each of the following (a rough sketch of this chain is given after this list):
    • arbitrary biased coin
    • a discrete uniform distribution over 1,...,n
    • a continuous uniform(0,1) distribution
    • use a continuous uniform to sample from an arbitrary distribution using inverse transform sampling
    • bonus: go from a biased coin (with unknown bias) to a fair coin (the von neumann trick)
  • why doesn't inverse transform sampling work in situations where we have to use metropolis-hastings?
    • like, what the heck does it mean to "only" have access to some constant multiple of the pdf? why would we ever get into such a situation? and why can't we just normalize, numerically approximate the cdf, and then invert it to do inverse transform sampling??? literally NONE of the explanations even RAISE this question. why???? (my own rough take on it is sketched after this list.)
  • an actually convincing example of MCMC. the stuff i've seen so far is so boring that i just don't even care whether we can sample from it.
  • where the heck does the accept/reject rule come from? why this ratio-of-densities thing to get the acceptance threshold? (a bare-bones sketch of the rule follows this list.)
  • why do we need a transition matrix (i.e. the proposal), can this matrix be literally anything, and why do we care whether it's symmetric?
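
here is a rough sketch of the fair-coin-to-everything chain from the first bullet. it is my own python sketch, not something lifted from the explanations i'm complaining about; the only primitive assumed is a fair_coin() that returns 0 or 1 with probability 1/2, which is faked below with the standard library.

import math
import random

def fair_coin():
    # the primitive we pretend is our only source of randomness
    return random.getrandbits(1)

def uniform01(bits=53):
    # (approximate) continuous Uniform(0,1): fair coin flips as binary digits
    return sum(fair_coin() * 2.0 ** -i for i in range(1, bits + 1))

def biased_coin(p):
    # coin with P(heads) = p, built from the fair coin
    return 1 if uniform01() < p else 0

def discrete_uniform(n):
    # uniform over {1, ..., n}: flip ceil(log2 n) fair coins, reject out-of-range values
    k = max(1, math.ceil(math.log2(n)))
    while True:
        x = 0
        for _ in range(k):
            x = 2 * x + fair_coin()
        if x < n:
            return x + 1

def exponential_via_inverse_transform(lam):
    # inverse transform sampling: if U ~ Uniform(0,1), then F_inverse(U) has cdf F;
    # for Exponential(lam), F_inverse(u) = -ln(1 - u) / lam
    return -math.log(1.0 - uniform01()) / lam

def fair_from_biased(flip):
    # bonus (von neumann trick): flip the unknown-bias coin twice;
    # HT and TH are equally likely, so when the two flips differ, keep the first one
    while True:
        a, b = flip(), flip()
        if a != b:
            return a

the point of the chain is that each step only uses the previous ones, so "sampling" ultimately bottoms out in coin flips; inverse transform sampling is the step that breaks once we only know the density up to a constant, which is the next bullet.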
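
on the "constant multiple of the pdf" question, my own rough take (again, not from these explanations) is that the standard situation is a bayesian posterior:

    p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{Z},
    \qquad
    Z = \int p(x \mid \theta')\, p(\theta')\, \mathrm{d}\theta'

the numerator p(x \mid \theta)\, p(\theta) is cheap to evaluate at any single point \theta, but Z is an integral over all of \theta-space. approximating Z (or the cdf) on a grid needs a number of density evaluations that grows exponentially in the dimension of \theta, and inverse transform sampling needs a one-dimensional cdf to invert in the first place; metropolis–hastings sidesteps both problems because Z cancels out of its acceptance ratio (see the sketch below).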
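
and here is a bare-bones sketch of the accept/reject rule itself, again my own python sketch with a made-up example density, mainly to show where the "division thing" sits and why knowing the target only up to a constant is enough:

import math
import random

def metropolis_hastings(log_f, propose, log_q, x0, n_steps):
    # log_f(x): log of the UNNORMALIZED target density (a constant multiple of the pdf)
    # propose(x): draws a candidate y from the proposal q(y | x)
    # log_q(y, x): log of q(y | x); drops out of the ratio when the proposal is symmetric
    samples = []
    x = x0
    for _ in range(n_steps):
        y = propose(x)
        # the "division thing": acceptance ratio f(y) q(x|y) / (f(x) q(y|x)).
        # the unknown normalizing constant of f cancels in this ratio, which is
        # exactly why a constant multiple of the pdf is all we need.
        log_alpha = (log_f(y) + log_q(x, y)) - (log_f(x) + log_q(y, x))
        if random.random() < math.exp(min(0.0, log_alpha)):
            x = y  # accept the candidate
        # on rejection we stay at x, and the repeated x still counts as a sample
        samples.append(x)
    return samples

# usage: sample from f(x) proportional to exp(-x^4) (normalizing constant unknown to us)
# with a symmetric gaussian random-walk proposal, so the q terms cancel
log_f = lambda x: -x ** 4
propose = lambda x: x + random.gauss(0.0, 1.0)
log_q = lambda y, x: 0.0  # symmetric proposal: q(y | x) = q(x | y)
chain = metropolis_hastings(log_f, propose, log_q, x0=0.0, n_steps=10_000)

with a symmetric proposal the q terms cancel and this reduces to plain metropolis; an asymmetric proposal is also allowed, but then the q(x|y)/q(y|x) correction factor is needed to keep the target distribution stationary, which (as far as i understand it) is the "hastings" part.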