How to compute a Bayesian estimate of a mean using data?

Question

I'm trying to learn Bayesian statistics, but I'm having a lot of trouble actually applying theoretical concepts to data. I'd appreciate any feedback on my line of reasoning.

Say I have historical data containing independent observations of a continuous random variable $X$, where $X$ doesn't follow a normal distribution. Then, I record new data on an individual of interest, $D = (x_{1},\dots, x_{n})$, where $D$ is also not normally distributed. My goal is to give an estimate of the "true" mean of this new individual's data.

To do this, I understand I need both a prior distribution and a likelihood function. What are the next steps here to find the posterior with the given data? Do I have to estimate the distribution of the prior using the population data? Then, how would I use the new data $D$ to specify a likelihood function?

I suppose this is a very broad question; I would just appreciate some general direction. I think the part that is giving me trouble is that the distributions don't lend themselves to this "conjugate prior" situation I've read about, where there is an easy relationship to refer to between the prior, the new data, and the posterior distribution.

score 0 · Accepted Answer · answered Oct 20 '22 at 07:45

Let's call your historical data $D_0$ and your new data $D_1$. If you have access to the raw $D_0$ data, the principled Bayesian approach is to fit a Bayesian model to $D_0$ first and then use the posterior from that analysis as the prior for the $D_1$ data. So if your parameter of interest is $\mu$, you would calculate the posterior as

$$\begin{align} p(\mu | D_1, D_0) &\propto p(D_1 | \mu) \, p(\mu | D_0) \\ &\propto p(D_1 | \mu) \,p(D_0 | \mu) \,p(\mu) \\ &= p(D_1, D_0 | \mu) \, p(\mu) \end{align}$$

As you can see, with Bayesian updating (and assuming $D_0$ and $D_1$ are conditionally independent given $\mu$), this is the same as using all the data at once.
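
The equivalence between sequential and all-at-once updating can be checked numerically. Here is a minimal sketch using a grid approximation. The exponential likelihood, the flat prior, the simulated data, and all variable names are assumptions chosen for illustration; they are not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: exponential draws, so clearly non-normal.
d0 = rng.exponential(scale=2.0, size=50)   # historical data D_0
d1 = rng.exponential(scale=2.0, size=10)   # new data D_1

# Grid of candidate values for the mean mu (must be positive here).
mu = np.linspace(0.05, 10.0, 2000)

def log_lik(data, mu):
    # Exponential log-likelihood parametrised by its mean mu:
    # sum_i [ -log(mu) - x_i / mu ]
    return -len(data) * np.log(mu) - data.sum() / mu

def log_normalise(log_p):
    # Normalise on the grid while staying in log space (avoids underflow).
    m = log_p.max()
    return log_p - (m + np.log(np.exp(log_p - m).sum()))

log_prior = np.zeros_like(mu)  # flat prior over the grid (an assumption)

# Two-step update: D_0 first, then that posterior is the prior for D_1.
log_post_d0 = log_normalise(log_prior + log_lik(d0, mu))
post_seq = np.exp(log_normalise(log_post_d0 + log_lik(d1, mu)))

# One-step update using all the data at once.
d_all = np.concatenate([d0, d1])
post_all = np.exp(log_normalise(log_prior + log_lik(d_all, mu)))

print("same posterior:", np.allclose(post_seq, post_all))
print("posterior mean of mu:", (mu * post_all).sum())
```

The grid approach only works for one or two parameters, but it makes the point without relying on any conjugacy.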

If you don't have the raw data but only summary statistics, or don't want to be so strict, you could pick the prior less formally, based on the summary statistics. For example, if your prior is Gaussian, you could take its mean and standard deviation from the $D_0$ summary statistics.

To do this, I understand I need both a prior distribution and a likelihood function. What are the next steps here to find the posterior with the given data?

If you have the prior $p(\theta)$ and the likelihood $p(X | \theta)$, all you need to do is apply Bayes' theorem

$$ p(\theta | X) = \frac{p(X | \theta) \, p(\theta)}{\int p(X | \theta) \, p(\theta) \, d\theta} $$

In some cases you will have a conjugate prior and a nice closed-form solution. In others you won't, because the integral in the denominator has no closed form, and you would need numerical methods (like MCMC sampling, which draws samples from the posterior without ever computing that integral).
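
For the non-conjugate case, here is a minimal sketch of a random-walk Metropolis sampler, one of the simplest MCMC methods. It only needs the unnormalised posterior, i.e. the numerator of Bayes' theorem. The exponential likelihood, the Gaussian prior with its hyperparameters, the simulated data, and the tuning values are all assumptions for illustration. In practice you would more likely use a library such as Stan or PyMC.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical new data (exponential, so non-normal).
d1 = rng.exponential(scale=3.0, size=20)

# Hypothetical prior hyperparameters, e.g. taken from D_0 summaries.
prior_mean, prior_sd = 2.0, 1.0

def log_post(mu):
    # Unnormalised log-posterior: log-likelihood + log-prior.
    if mu <= 0:
        return -np.inf  # the mean of an exponential must be positive
    log_lik = -len(d1) * np.log(mu) - d1.sum() / mu
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    return log_lik + log_prior

n_iter, step = 20000, 0.5           # tuning values chosen by hand
samples = np.empty(n_iter)
mu, lp = prior_mean, log_post(prior_mean)
for i in range(n_iter):
    prop = mu + step * rng.normal()  # symmetric random-walk proposal
    lp_prop = log_post(prop)
    # Accept with probability min(1, posterior ratio).
    if np.log(rng.uniform()) < lp_prop - lp:
        mu, lp = prop, lp_prop
    samples[i] = mu

draws = samples[2000:]               # drop burn-in
print("posterior mean:", draws.mean())
print("95% interval:", np.quantile(draws, [0.025, 0.975]))
```

The posterior mean of the draws is then a Bayesian point estimate of the mean, and the quantiles give a credible interval.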

Do I have to estimate the distribution of the prior using the population data?

No, not if that means the data you are analyzing. If you used $D_1$ to calculate both the prior and the likelihood, you would use this data twice, leading to an overconfident result. The prior is something that you pick before seeing the data. In your case, however, you can pick the prior using $D_0$, as described above.

Then, how would I use the new data $D$ to specify a likelihood function?

You don't pick the likelihood based on the data but on your understanding of the problem. For example, if you know that your data represents counts of events that happen at a fixed rate in a fixed interval, you could pick the Poisson distribution as a model, because it describes such a scenario. This is of course a bit idealistic, as in real life we peek at the data and consider its characteristics when deciding on the model (likelihood and prior), as discussed by Gelman, Simpson, and Betancourt (2017).
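
To make the Poisson example concrete, here is a minimal sketch of the conjugate case, where a Gamma prior on the Poisson rate gives a closed-form Gamma posterior. The counts and the prior hyperparameters are made-up values for illustration only.

```python
# Hypothetical counts observed in six equal-length intervals.
counts = [3, 5, 2, 4, 6, 3]

# Gamma(shape=a, rate=b) prior on the Poisson rate; values are assumptions.
a_prior, b_prior = 2.0, 1.0

# Conjugate update: add the total count to the shape
# and the number of intervals to the rate.
a_post = a_prior + sum(counts)
b_post = b_prior + len(counts)

# Mean and variance of the Gamma(a_post, b_post) posterior.
post_mean = a_post / b_post
post_var = a_post / b_post**2

print("posterior mean of the rate:", post_mean)
print("posterior variance of the rate:", post_var)
```

Since the mean of a Poisson distribution equals its rate, `post_mean` is also a Bayesian estimate of the mean count per interval.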
