
In rejection sampling or Markov chain Monte Carlo methods, we usually have a target distribution $p(x)$ whose form makes it difficult or impossible to draw samples from directly, but we can evaluate $p(x)$ up to a normalising constant. We then draw proposals from a simpler distribution $q(x)$ that we *can* sample from, and correct for the mismatch between $q$ and $p$.

My problem appears here: in order to accept or reject a proposed value $x'$, we evaluate $p(x')$ and check whether it satisfies the acceptance criterion.
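To make the accept/reject step concrete, here is a minimal rejection sampler; the target density and the bound are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_sample(unnorm_pdf, lo, hi, bound, n):
    # Draw x' uniformly over [lo, hi] and accept it with probability
    # unnorm_pdf(x') / bound, where bound >= max of unnorm_pdf on [lo, hi].
    out = []
    while len(out) < n:
        x = rng.uniform(lo, hi)
        if rng.uniform(0.0, bound) < unnorm_pdf(x):  # the accept/reject test
            out.append(x)
    return np.array(out)

# Unnormalised standard normal density; its maximum on [-5, 5] is 1
x = rejection_sample(lambda t: np.exp(-0.5 * t**2), -5.0, 5.0, 1.0, 5000)
```

Note that only pointwise evaluations of the unnormalised density are needed; no normalising constant appears anywhere.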

Why can we evaluate $p(x)$ at any point, but not sample directly from $p(\cdot)$? It seems that if we can evaluate $p(x)$ for all $x$, say on a uniform grid over the support of $p(\cdot)$, we can directly recover the shape of $p(x)$ without any distribution-convergence steps.

If someone could spot where I am confused, it would be much appreciated, as I can't seem to find a clear explanation of this issue anywhere (it must be really trivial!).

hirschme
  • If your distribution has a very narrow peak, you might miss this peak with uniform sampling. – ziggystar Dec 15 '15 at 16:11
  • You seem to be jumping between something that sounds like Gibbs sampling and something that sounds like Metropolis-Hastings, possibly with a little of something else thrown in. It sounds like you've read a bunch of things but have not done any MCMC and the ideas have become jumbled together in your head. Choose a particular thing to ask about (like MH), and try to ask a more specific question. – Glen_b Dec 17 '15 at 07:08
  • @Glen_b You are definitely right with that assumption. Nevertheless, I feel that Sobi helped with the main issues I had, especially with the important difference between sampling and evaluating. Hopefully the rest will start to clear up after implementing MCMC methods in practice. – hirschme Dec 18 '15 at 12:32

2 Answers


I think what you have in mind is to evaluate $p(x)$ for all $x \in \Omega$, treat the result as a discrete distribution, and then pick an outcome at random according to those probabilities (which we know how to do for a discrete distribution).

The problem, however, is that $|\Omega|$ is often extremely large, and in some cases infinite (when $x$ is continuous), which makes the above approach impractical.
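In one dimension the discretise-and-sample idea does work, which may be why it feels so natural; a sketch (the target density and grid bounds are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def grid_sample(unnorm_pdf, lo, hi, n_grid=10_000, n_samples=1000):
    # Evaluate the unnormalised density on a grid, normalise the values
    # into categorical probabilities, and sample grid points accordingly.
    # Feasible in 1-D, but the grid size grows exponentially with the
    # dimension of x, which is why this does not scale.
    xs = np.linspace(lo, hi, n_grid)
    w = unnorm_pdf(xs)
    w = w / w.sum()  # normalise -> discrete (categorical) distribution
    return rng.choice(xs, size=n_samples, p=w)

# Unnormalised standard normal density (constant dropped on purpose)
samples = grid_sample(lambda x: np.exp(-0.5 * x**2), -5.0, 5.0)
```

With a 10-point grid per dimension, a 20-dimensional problem would already need $10^{20}$ evaluations, which is why MCMC is used instead.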

Also, it is important to realize the distinction between being able to draw a sample from a distribution and being able to evaluate a distribution at an arbitrary outcome: the latter does not mean that you can sample from the distribution easily. As far as I know, we only know how to draw samples from a uniform distribution, and sampling from all other distributions (e.g. Gaussian, categorical, etc.) is done by applying various transformations to uniformly drawn samples (I might be wrong on this though).

Sobi

The method is extensively used to sample from a posterior distribution in Bayesian statistics. The following refers to that situation.

Following Bayes' theorem, the posterior distribution is proportional to the prior times the likelihood. In symbols:

$$ p(\theta | x) \propto p(\theta) \times p(x | \theta) $$

Sometimes, $p(\theta | x)$ corresponds to a known distribution, such as a Student's $t$ distribution. In that case, values can be sampled directly from that distribution and MCMC is not needed.

But, $p(\theta | x)$ does not always correspond to a known distribution...

Given the prior and the likelihood, however, you can always evaluate the posterior at every parameter value up to a multiplicative constant (cf. Bayes' theorem above). As explained by @Sobi, uniform sampling (as you suggest) will not work for continuous distributions. But the MCMC method applies, because the unknown normalising constant cancels in the acceptance ratio.
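A minimal random-walk Metropolis sketch for this situation; the model (normal likelihood with a wide normal prior) is a hypothetical example, chosen only so the unnormalised log-posterior is easy to write down:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_unnorm_posterior(theta, data):
    # Hypothetical model: x_i ~ N(theta, 1) likelihood, theta ~ N(0, 10^2)
    # prior. Only the unnormalised log-density is needed.
    log_prior = -0.5 * (theta / 10.0) ** 2
    log_lik = -0.5 * np.sum((data - theta) ** 2)
    return log_prior + log_lik

def metropolis(data, n_iter=20_000, step=0.5):
    theta = 0.0
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + step * rng.normal()  # symmetric random-walk proposal
        # Accept with probability min(1, p(prop | x) / p(theta | x)):
        # the unknown normalising constant cancels in this ratio.
        if np.log(rng.uniform()) < (log_unnorm_posterior(prop, data)
                                    - log_unnorm_posterior(theta, data)):
            theta = prop
        chain[i] = theta
    return chain

data = rng.normal(3.0, 1.0, size=50)
chain = metropolis(data)
# With a wide prior, the posterior mean is close to the sample mean of data
```

Notice that `log_unnorm_posterior` is only ever *evaluated*, never sampled from or normalised, which is the distinction the question is about.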

ocram