When specifying a Bayesian model, one can use weakly informative priors. However, such priors may concern many researchers: if a prior is only weakly informative, its imprecision might bias the results. It is also my understanding that the influence of the prior diminishes as more data are introduced, but that the prior may be very influential if N is small. Given this, if you know beforehand that your priors are weak and that you have a small sample size, should you avoid Bayesian inference?
- To the contrary, objective Bayesian priors have the effect of smoothing parameter estimates in small samples and can be helpful. The classical example of this phenomenon is the reference Beta(0.5,0.5) prior with the binomial likelihood. – John Madden Jul 19 '23 at 18:16
- This gets the logic backwards. The prior dominates when faced with weak data precisely because there's not much information in the data, so you should depend more on previous knowledge/expectation. Using a prior only when you have strong data is doing Bayesian inference when it is least useful. – mkt Jul 19 '23 at 19:40
- Instead of informative priors, weakly or otherwise, you can also pick non-informative priors. But then you're essentially doing maximum likelihood estimation under a different name. – Durden Jul 19 '23 at 23:05
3 Answers
There are two issues here: 1) You haven't got much data 2) You haven't got strong priors.
The first is a problem whether you are a frequentist or a Bayesian or anything else. Not much data yields imprecise estimates, dangers of over-fitting and so on. I don't see how the size of your data set influences whether you should go Bayesian or not. It might mean that you shouldn't do anything except get more data, but that's a different issue.
As to weak priors being a concern, I think that's just backwards. I would be much more leery if I read a study that had very strong priors. I am not a Bayesian, but, to me, that raises the question: If you already know the answer, why are you doing the study?
There may be a good answer to that question; it depends on the situation. But it would alert me and make me curious.
- I'd just add that if the prior is strong and it works against what you're trying to show, the result should be seen as convincing. – Frank Harrell Jul 21 '23 at 11:12
I'm not sure what point you are trying to make. Weakly informative priors affect the results weakly. If you used more informative priors, they would affect the results even more. With a growing sample size, the influence of the prior diminishes, although with a very strong prior this may happen slowly. If your point is to remove the impact of the priors, why bother using the Bayesian approach at all?
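To make the point about sample size concrete, here is a minimal sketch using the conjugate beta-binomial model, where the posterior mean has a closed form. The two priors and the fixed observed proportion of 0.8 are assumptions chosen purely for illustration; they do not come from any real study.

```python
def posterior_mean(a, b, successes, trials):
    """Posterior mean of a success probability under a Beta(a, b) prior
    after observing `successes` out of `trials` binomial trials."""
    return (a + successes) / (a + b + trials)


# Hypothetical priors, both centred on 0.5.
weak_prior = (1, 1)      # Beta(1, 1): uniform on [0, 1]
strong_prior = (50, 50)  # Beta(50, 50): tightly concentrated near 0.5

for trials in (10, 100, 1000):
    successes = trials * 8 // 10  # observed proportion held at 0.8
    weak = posterior_mean(*weak_prior, successes, trials)
    strong = posterior_mean(*strong_prior, successes, trials)
    print(f"n={trials:4d}  weak={weak:.3f}  strong={strong:.3f}")
```

Under these assumptions the weak-prior estimate is already 0.75 at n = 10, while the strong-prior estimate is still only 0.65 at n = 100 and remains below the weak-prior one even at n = 1000.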
As for the small sample size, keep in mind that a non-Bayesian approach would not be a remedy. If the sample is small, it may be very vulnerable to small differences in the data. Even something as simple as an arithmetic mean can fluctuate a lot in a small sample if you change even a single value. For some of the more complicated models, you may not be able to estimate their parameters at all if you don't have enough data (while it could be possible with the Bayesian approach). Also, non-Bayesian models have their own assumptions and parameters that do affect the results. So given the reasons above, one could paraphrase the question as: "Should we do statistics with small samples at all?" The problem is that sometimes we have to because there is simply no more data available.
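The remark above about the arithmetic mean can be checked with a few lines of standard-library Python. The numbers are made up for illustration, and the larger sample is just the small one repeated, so that both have the same mean before the change.

```python
from statistics import mean

small = [4.8, 5.1, 5.0, 4.9, 5.2]  # made-up sample, n = 5
large = small * 100                # same values repeated, n = 500


def shift_from_one_change(sample, new_value):
    """Change in the sample mean when the last observation is replaced."""
    altered = sample[:-1] + [new_value]
    return mean(altered) - mean(sample)


# Replace a single observation (5.2) with 9.0 in each sample.
print(shift_from_one_change(small, 9.0))
print(shift_from_one_change(large, 9.0))
```

Replacing one value moves the mean by (9.0 - 5.2) / n, which is 0.76 for n = 5 but only 0.0076 for n = 500. No prior is involved anywhere, so this fragility belongs to the small sample and not to the Bayesian approach.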
Rather than asking "Am I using a prior that is strong or weak?" (in the sense of concentrated or diffuse over the parameter space), ask why you are using that particular prior. Do you actually trust your prior, or was it chosen arbitrarily / for convenience?
On the one hand, we may have a well-justified prior (because of sincerely held beliefs; or lots of prior data; or a reference prior that is widely agreed to be useful by people in our scientific domain, like @JohnMadden's beta/binomial example in some fields), and we want to update it with new data. Then even with small n, a Bayesian would be OK with the prior having a big effect on the posterior.
On the other hand, if we don't have a prior that we trust, and we're just plugging a prior into our calculations for convenience, that's when your concern is valid: When n is small, the posterior can be very sensitive to tiny changes in the prior (even a weakly-informative prior). Due to this sensitivity, using an arbitrary prior just adds noise to your results.
For large n, the effect of the prior should wash out. But for small n, if you didn't trust the prior before you saw the data, you shouldn't trust the posterior either. With small n and no reliable prior, instead of a Bayesian analysis---or even a Frequentist analysis (which may just confirm that "The sample is too small to estimate these parameters with adequate precision")---I would just report descriptive statistics / graphs and be very transparent about the study's limitations: due to the sample size, our results cannot safely generalize to the population, etc.
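As a rough illustration of this sensitivity, the sketch below compares posterior means for a binomial proportion under three priors that are all commonly described as weak. The choice of priors and the data (an observed proportion of 0.2) are assumptions made for the example, and `beta_binomial_mean` and `spread` are hypothetical helper names.

```python
def beta_binomial_mean(a, b, successes, trials):
    """Posterior mean under a Beta(a, b) prior with binomial data."""
    return (a + successes) / (a + b + trials)


# Three candidate "weak" priors (illustrative choices only).
weak_priors = {
    "uniform Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "near-Haldane Beta(0.01, 0.01)": (0.01, 0.01),
}


def spread(successes, trials):
    """Range of posterior means across the candidate weak priors."""
    means = [
        beta_binomial_mean(a, b, successes, trials)
        for a, b in weak_priors.values()
    ]
    return max(means) - min(means)


print(f"n=5:   spread = {spread(1, 5):.4f}")
print(f"n=500: spread = {spread(100, 500):.4f}")
```

With 1 success in 5 trials the posterior means range from about 0.20 to about 0.29 depending on which weak prior is picked; with 100 successes in 500 trials they agree to within roughly 0.001.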
- I don't follow this. If you don't trust the prior, why wouldn't you just use a weaker or different one? – mkt Jul 22 '23 at 17:25
- Switching to a different weak prior is reasonable if you can say "I may not know much about the parameter -- but I firmly believe that this prior encapsulates exactly how little I know about it." But eliciting priors can be difficult. If instead all you can say is "Gee whiz, I have no idea what I do and don't know about the parameter," how should you choose a specific weak prior? For large samples it might not matter; for small samples, it does. – civilstat Jul 25 '23 at 14:25
- Hmmm. I agree with your logic, but I find it hard to think of a scenario in which one cannot come up with even an extremely weak prior. And if we truly had zero knowledge, we probably should be doing descriptive work or theory instead of inference. – mkt Jul 25 '23 at 14:32
- I think we agree. We can almost always write down an extremely weak prior. But if we can write down several extremely weak priors and we have too little knowledge to confidently choose one of them -- and our sample's so small that the posterior will be sensitive to our choice among these weak priors -- then probably we should indeed do descriptive work or theory instead of inference. (Or collect more data!) – civilstat Jul 25 '23 at 14:39