7

By "permissible" (for lack of a better term) I mean models that, despite a "flat" (improper) prior (i.e., $\int_{\Theta} p(\theta) \, d\theta = +\infty$), nevertheless produce a proper posterior (i.e., $\int_{\Theta} p(\theta|\mathbf{x}) \, d\theta = 1$). Under those circumstances, the likelihood does the heavy lifting, and the MAP will equal the MLE.

But there are of course many models whose likelihood does not have a (unique) maximum and which usually require some additional constraints to be estimable. So intuition says that for those models, a flat prior would produce an improper posterior. Is that the case, and if not, what characterizes the exceptions?

EDIT: Having looked through the archives some more, I noticed that a related question was answered here.

Durden
  • 1,171

2 Answers

5

No, these are somewhat different problems. If you have an improper flat prior and you don't have a unique MLE, you will often not have a unique posterior mode, so neither MLE nor MAP estimation will be useful without some additional thought/constraints. But you can easily have a proper posterior.

Some examples:

  • Mixture models, where there is non-identifiability because of relabelling. There will still be relabelling in the posterior, but the posterior will be proper as long as the mixing probabilities are bounded away from zero.
  • 'Flat' or nearly flat regions in the likelihood: if you have a $2\times 2$ table where you only observe the margins, the odds ratio is non-identifiable and the likelihood is nearly flat over some range of values. Given a flat prior, you'd get a flat posterior over that range. However, the flat range will typically be bounded, so the posterior is proper.
  • It's quite possible to have non-identifiability with bounded parameter spaces, so even a flat posterior would be proper. Suppose $Y\sim \mathrm{Binomial}(1,p_1)$ and you have a flat prior over $[0,1]\times[0,1]$ for $(p_1,p_2)$. The posterior for $p_2$ (about which you have no data) will still be flat, but it will not be improper.
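The last example can be checked numerically. The following is a minimal sketch, assuming a single observation $y = 1$ (hypothetical data chosen for illustration) and a midpoint rule on the unit square; the function names are made up for this example. It shows that the normalizing constant is finite and that the marginal posterior of $p_2$ is flat but proper.

```python
# Midpoint-rule check that a flat prior on [0,1] x [0,1] gives a proper posterior.
# Assumption: one observation y = 1 (hypothetical data, not from the answer above).

def likelihood(p1, p2, y=1):
    # Bernoulli likelihood; p2 does not appear, so it is non-identifiable.
    return p1**y * (1 - p1)**(1 - y)

def normalizing_constant(n=200, y=1):
    # Integral of the likelihood over the unit square (flat prior).
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    return sum(likelihood(p1, p2, y) for p1 in mids for p2 in mids) * h * h

def marginal_p2(p2, n=200, y=1):
    # Posterior marginal density of p2: integrate p1 out, then normalize.
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    unnormalized = sum(likelihood(p1, p2, y) for p1 in mids) * h
    return unnormalized / normalizing_constant(n, y)

Z = normalizing_constant()
print("normalizing constant:", Z)  # finite, so the posterior is proper
print("marginal of p2:", [marginal_p2(p2) for p2 in (0.1, 0.5, 0.9)])  # constant in p2
```

The marginal density of $p_2$ comes out the same at every point, which is the flat-but-proper behaviour described in the bullet.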

Conversely, you can get an improper posterior without non-identifiability. Hobert and Casella discuss this for linear mixed models here. They don't explicitly use flat priors, but their improper priors could be regarded as flat for some transformed parameter.

One situation where you can get an improper posterior from non-identifiability is when the likelihood is flat on an unbounded subspace of the parameter space. Suppose you have a model $Y\sim N(\alpha+\beta,1)$. The data only tell you about $\alpha+\beta$, and your posterior for $\alpha-\beta$ will be flat over the whole real line, and hence improper, if the prior is flat.
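To spell out why this case is improper, here is a short derivation for a single observation $y$, assuming a flat prior on $(\alpha,\beta)\in\mathbb{R}^2$ and dropping the constant $1/\sqrt{2\pi}$ from the likelihood. The variables $u=\alpha+\beta$ and $v=\alpha-\beta$ are introduced only for this illustration; the change of variables has Jacobian $1/2$:

$$\int_{\mathbb{R}^2} e^{-\frac{1}{2}(y-\alpha-\beta)^2} \, d\alpha \, d\beta = \frac{1}{2}\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} e^{-\frac{1}{2}(y-u)^2} \, du \right) dv = \frac{\sqrt{2\pi}}{2} \int_{-\infty}^{\infty} dv = +\infty$$

The inner integral over $u$ is finite, so the divergence comes entirely from the direction $v$ along which the likelihood is constant.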

Thomas Lumley
  • 38,062
3

If the prior is uniform then

$$f(\theta|x) = \frac{\mathcal{L}(\theta,x)}{\int_{\Theta} \mathcal{L}(\theta',x) \, d\theta'}$$

This is a proper distribution exactly when the integral of the likelihood function in the denominator is finite.

A simple example where this fails is when, for a particular observation $x$, the likelihood stays above some positive value over an infinite range of the parameter. For example, consider the Poisson distribution $X \sim \mathrm{Poisson}(\lambda=1/\theta)$ with $\theta > 0$ and the observation $x=0$. The likelihood is $\mathcal{L}(\theta,0) = e^{-1/\theta}$, which tends to $1$ as $\theta \to \infty$, so the integral $\int_0^\infty e^{-1/\theta} \, d\theta$ diverges and the posterior cannot be normalized.
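The divergence can also be seen numerically. This is a rough sketch, assuming a midpoint rule and an arbitrary set of truncation points $T$ (both are choices made for illustration): the truncated integral $\int_0^T e^{-1/\theta} \, d\theta$ keeps growing roughly in proportion to $T$ instead of levelling off.

```python
import math

def truncated_integral(T, n=100_000):
    # Midpoint-rule approximation of the integral of exp(-1/theta) over (0, T].
    # For tiny theta, exp(-1/theta) underflows to 0.0, which is the correct limit.
    h = T / n
    return sum(math.exp(-1.0 / ((i + 0.5) * h)) for i in range(n)) * h

# The truncation points below are arbitrary illustrative choices.
for T in (10, 100, 1000, 10000):
    print(T, truncated_integral(T))
```

A likelihood with a finite integral would give values that stabilize as $T$ increases; here they do not.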