
I am searching for a computationally simple way to approximate a Bayes factor. Currently, I'm using an approach that seems logical to me, but I would still like to know whether it is correct:

I repeatedly sample parameters from the prior, compute the likelihood for each draw, and in the end average all those likelihoods. I do this for both models and take the ratio of the two averaged likelihoods as an approximation to the Bayes factor.

Assuming this is correct, I think the only limitation of this approach is that it becomes computationally intensive as the number of parameters grows?

So, is this approach fine for a small number of parameters? Are there any limitations to this approach? Is there a (simple) alternative way?

beginneR

1 Answer


I'm honestly a little unsure of what you are currently doing. If I understand you correctly, you sample from the prior of your parameters and then compute the empirical mean of the likelihood over those draws? If so, this is a correct method for computing the Bayes factor: each average is a Monte Carlo estimate of the marginal likelihood $\int p(data|\theta) p(\theta) d\theta$, and it can be viewed as importance sampling with the prior as the proposal distribution. It will give you the correct answer if you generate enough samples from the prior, but it might require a very large number of samples to do so, especially when the posterior is concentrated relative to the prior.
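
For concreteness, here is a minimal Python sketch of exactly this estimator, under an assumed toy model (normal data with known unit variance and two models that differ only in their normal prior on the mean; none of these choices come from the question):

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

rng = np.random.default_rng(0)
data = rng.normal(loc=0.5, scale=1.0, size=20)   # simulated stand-in for the observed data

def marginal_likelihood(data, prior_mean, prior_sd, n_samples=100_000):
    """Monte Carlo estimate of p(data) = E_prior[ p(data | theta) ]."""
    theta = rng.normal(prior_mean, prior_sd, size=n_samples)   # draws from the prior
    # log-likelihood of the full data set under each prior draw
    log_lik = stats.norm.logpdf(data[:, None], loc=theta, scale=1.0).sum(axis=0)
    # average the likelihoods in log space (log-sum-exp) for numerical stability
    return np.exp(logsumexp(log_lik) - np.log(n_samples))

# Bayes factor: ratio of the two averaged likelihoods
bayes_factor = (marginal_likelihood(data, prior_mean=0.0, prior_sd=1.0)
                / marginal_likelihood(data, prior_mean=1.0, prior_sd=1.0))
```

Averaging in log space matters in practice: the product of many likelihood terms underflows ordinary floating point long before the average itself becomes meaningless.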

There are several alternatives. If you can determine the maximum a posteriori (MAP) value of your parameter, one common approximation is:

$$ \int p(data,\theta) d\theta \approx p(data,\theta_{MAP}) $$

which simply plugs the MAP value into the joint density $p(data,\theta)$ (the likelihood times the prior), as sketched below.
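
A minimal sketch of this plug-in, reusing the imports, `rng`, and `data` from the toy model above (the model and optimizer choice are illustrative assumptions, not part of the original answer):

```python
from scipy import optimize

def log_joint(theta, data, prior_mean=0.0, prior_sd=1.0):
    # log p(data, theta) = log-likelihood + log-prior
    return (stats.norm.logpdf(data, loc=theta, scale=1.0).sum()
            + stats.norm.logpdf(theta, loc=prior_mean, scale=prior_sd))

def map_plugin_marginal(data):
    # locate the MAP by minimizing the negative log joint density
    res = optimize.minimize_scalar(lambda t: -log_joint(t, data))
    theta_map = res.x
    return theta_map, np.exp(log_joint(theta_map, data))   # MAP and plug-in value

theta_map, map_approx = map_plugin_marginal(data)
```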

A correction to that consists in using the Laplace approximation of $p(data,\theta)$ around the MAP. For a scalar $\theta$ this gives:

$$ \int p(data,\theta)\, d\theta \approx p(data,\theta_{MAP}) \sqrt{\frac{2\pi}{H}} $$

where $H = -\left.\frac{d^2}{d\theta^2} \log p(data,\theta)\right|_{\theta_{MAP}}$ is the negative second derivative of the log joint density at the MAP (in the multivariate case the factor becomes $(2\pi)^{d/2} |H|^{-1/2}$ with $H$ the negative Hessian).
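
Continuing the same toy sketch, the Laplace correction only needs the curvature of the log joint at the MAP, here obtained by finite differences (again an illustrative assumption, not the answer's own code):

```python
def laplace_marginal(data, theta_map, eps=1e-4):
    # H = -(d^2/dtheta^2) log p(data, theta) at the MAP, via central differences
    h = -(log_joint(theta_map + eps, data)
          - 2.0 * log_joint(theta_map, data)
          + log_joint(theta_map - eps, data)) / eps**2
    # Laplace approximation: p(data) ~ p(data, theta_MAP) * sqrt(2*pi / H)
    return np.exp(log_joint(theta_map, data)) * np.sqrt(2.0 * np.pi / h), h

laplace_approx, curvature = laplace_marginal(data, theta_map)
```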

There are also more sophisticated approximations of the Bayes factor given by algorithms such as variational Bayes (VB-EM) and expectation propagation. All of these methods give approximations that can never hope to be exact (but might be good enough for whatever it is you want to do).

For methods whose results become exact in the limit of many samples, only sampling solutions exist. Your prior sampling is one valid way. A (possibly much more efficient) alternative is importance sampling with a proposal $q(\theta)$ that better approximates the posterior: you take samples from $q(\theta)$ and then compute the empirical mean of the ratio $p(data,\theta) / q(\theta)$, as sketched below.
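
A minimal sketch of that estimator, where I take $q$ to be the Gaussian (Laplace) fit from above; this choice of proposal is an assumption for illustration, still within the same toy model:

```python
def importance_marginal(data, q_mean, q_sd, n_samples=100_000):
    theta = rng.normal(q_mean, q_sd, size=n_samples)             # draws from q
    log_num = (stats.norm.logpdf(data[:, None], loc=theta, scale=1.0).sum(axis=0)
               + stats.norm.logpdf(theta, loc=0.0, scale=1.0))   # log p(data, theta)
    log_q = stats.norm.logpdf(theta, loc=q_mean, scale=q_sd)     # log q(theta)
    # p(data) ~ empirical mean of p(data, theta) / q(theta), averaged in log space
    return np.exp(logsumexp(log_num - log_q) - np.log(n_samples))

# proposal: Gaussian centered at the MAP with variance 1/H from the Laplace sketch
q_estimate = importance_marginal(data, q_mean=theta_map, q_sd=1.0 / np.sqrt(curvature))
```

The closer $q$ is to the true posterior, the lower the variance of this estimator; taking $q$ equal to the prior recovers your original scheme.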

I think that exhausts the possibilities.

  • Thank you for your detailed answer. I will try to use the "MAP-method" and compare it with my previous approach. – beginneR Jul 29 '15 at 12:15