
I have a mixed effects logistic regression model. All the predictors are categorical (I need to keep age categorical as well, rather than treating it as a continuous variable). The predictors are coded with orthogonal sum-to-zero contrasts. However, I have a problem of near-perfect separation of the data.
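For reference, this is roughly how the contrasts are set up (a sketch, assuming the predictors are already factors in my data frame):

# sum-to-zero contrasts for each categorical predictor
for (v in c("age", "gender", "real", "speaker")) {
  contrasts(data[[v]]) <- contr.sum(nlevels(data[[v]]))
}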

mod <- glmer(score ~ 1 + age + gender + real*speaker + (1|part) + (1|item),
             data = data, family = binomial(link = "logit"),
             control = glmerControl(optimizer = "bobyqa"))

I have read from the link here that for mixed models the only solution is to switch to a Bayesian model, since penalized regression is not available for mixed models. I have decided to use the brms package since its syntax is very similar to lme4's. However, I am not familiar with Bayesian models and priors, and I am not sure about the results.

mod_bayesian <- brm(score ~ 1 + age + gender + real*speaker + (1|part) + (1|item),
                    data = data, family = bernoulli,
                    iter = 1000, chains = 4, cores = 4)

Are mod and mod_bayesian equivalent? Or, better, what is the Bayesian equivalent of mod? And how should I go about selecting priors?

Katherine

2 Answers


With uniform priors, the posterior is proportional to the likelihood, and the same holds for their marginals. So the maximum likelihood estimate and the maximum a posteriori estimate coincide when you use uniform priors (but the confidence intervals do not, since they are based not on the likelihood distribution but on the fiducial distribution).

The exception is the random effects, which are assumed to be normally distributed; the prior on their covariance matrix, however, is effectively treated as uniform (I am not 100% sure about this; I believe priors may be used to ease computation when random-effect variances become very small).

Also, if you look for the maximum marginal a posteriori estimate, then you have to integrate out the random effects, but not the covariance matrices. I am not sure how to do this in Stan.
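One practical way to see which priors brm would actually apply is brms::get_prior; a sketch using the question's formula (the defaults it reports may vary across brms versions):

library(brms)
# Inspect the default priors brms would assign to this model. In recent
# versions, population-level coefficients (class "b") get improper flat
# priors, while the intercept and random-effect SDs get Student-t defaults.
get_prior(score ~ 1 + age + gender + real*speaker + (1|part) + (1|item),
          data = data, family = bernoulli())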


As an alternative approach: penalties can be incorporated by adding additional (pseudo-)observations. See Ridge penalized GLMs using row augmentation?
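For the Gaussian case, a minimal sketch of that row-augmentation idea (hypothetical data frame d with response y and predictors x1, x2; lambda is a placeholder penalty):

# Ridge via augmentation: appending sqrt(lambda) * I to X and zeros to y
# makes ordinary least squares return (X'X + lambda*I)^{-1} X'y.
X <- model.matrix(~ x1 + x2, data = d)
lambda <- 1
X_aug <- rbind(X, sqrt(lambda) * diag(ncol(X)))
y_aug <- c(d$y, rep(0, ncol(X)))
beta_ridge <- lm.fit(X_aug, y_aug)$coefficients
# (In practice the intercept is usually left unpenalized, i.e. its
# augmented row is set to zero.)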

  • This question discusses priors for the covariance matrix: https://stats.stackexchange.com/questions/354327/unbounded-likelihoods-for-unpenalized-mixed-effects, but it seems it was not about the regular mixed-effects GLM. – Sextus Empiricus Mar 31 '24 at 08:15

Brief Answer

I would strongly recommend reading up on Bayes before using Bayes.

There is a serious danger in using the defaults (Gelman & Yao, 2021; Moyé, 2008; Smid & Winter, 2020), which here means the uniform priors referred to by Sextus (who simply describes them and probably isn't explicitly advocating for them). Some practical reading exists out there (see Statistical Rethinking in the references for a less math-intensive text) that can help shore up this hole. The main thing you need to consider is a prior that actually makes sense for your scenario. As an example, what is a reasonable mean score in the literature for gender? How much does it vary? One could straightforwardly place a normally distributed prior $N(\mu, \sigma)$, with mean $\mu$ and standard deviation $\sigma$, on this coefficient, given that it can be positive or negative and fluctuate around a mean. However, prior understanding of the research should guide the selection of the actual mean and scale values that go here.
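To make that concrete, a sketch of how such priors could be passed to brm (the numeric scales below are placeholders, not recommendations; in a real analysis they should come from the literature):

library(brms)
priors <- c(
  prior(normal(0, 1),   class = "b"),          # fixed-effect coefficients
  prior(normal(0, 1.5), class = "Intercept"),  # intercept on the logit scale
  prior(exponential(1), class = "sd")          # random-effect SDs
)
mod_prior <- brm(score ~ 1 + age + gender + real*speaker + (1|part) + (1|item),
                 data = data, family = bernoulli(),
                 prior = priors, iter = 2000, chains = 4, cores = 4)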

To your question: in some respects models with uniform priors essentially produce the same thing as frequentist models, so it's not usually useful to use such priors if one wants to be "truly Bayesian". I would first investigate what is causing the near-perfect separation and determine whether Bayes is even necessary. There may be a much easier solution if the issue can be fixed directly, but first one must determine why the outcome is almost perfectly predicted by the given categories.
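A quick way to spot where the separation comes from is to cross-tabulate the outcome against each categorical predictor and look for empty cells (a sketch assuming the question's variable names):

with(data, table(age, score))
with(data, table(gender, score))
with(data, table(interaction(real, speaker), score))
# Any row with a zero cell flags a category that (almost) perfectly
# predicts the outcome.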

As a final comment, I'm not sure why you need to make age categorical. Binning continuous variables is in general a poor practice (Royston et al., 2006).

References

  • Gelman, A., & Yao, Y. (2021). Holes in Bayesian statistics. Journal of Physics G: Nuclear and Particle Physics, 48(1), 014002. https://doi.org/10.1088/1361-6471/abc3a5
  • McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (2nd ed.). Taylor and Francis, CRC Press.
  • Moyé, L. A. (2008). Bayesians in clinical trials: Asleep at the switch. Statistics in Medicine, 27(4), 469–482. https://doi.org/10.1002/sim.2928
  • Royston, P., Altman, D. G., & Sauerbrei, W. (2006). Dichotomizing continuous predictors in multiple regression: A bad idea. Statistics in Medicine, 25, 127–141. https://doi.org/10.1002/sim.2331
  • Smid, S. C., & Winter, S. D. (2020). Dangers of the defaults: A tutorial on the impact of default priors when using Bayesian SEM with small samples. Frontiers in Psychology, 11, 611963. https://doi.org/10.3389/fpsyg.2020.611963
  • "In some respects models with uniform priors essentially produce the same thing as frequentist models, so it's not usually useful to use such priors if one wants to be 'truly Bayesian'." I referred to uniform priors, but I hope that this didn't come across as a suggestion to actually use these priors. I saw/interpreted the question as being about reading up on Bayes instead of using Bayes, and about the curiosity whether there is some default blank canvas that makes the fit from glmer equivalent to brm, and a viewpoint from which one can interpret the penalties. – Sextus Empiricus Mar 31 '24 at 11:05
  • No, I understood the context of your answer as simply explaining what those priors were, and not as an explicit expression that you felt they should be used. I'll edit my answer to make that clearer. – Shawn Hemelstrand Mar 31 '24 at 11:06
  • @ShawnHemelstrand The near-perfect separation of the data is due to the fact that my control group (age = adults) completely predicts the outcome: all the adult participants answer the test items correctly. However, I need to keep age as a categorical variable since I am interested in showing contrasts between age groups. – Katherine Mar 31 '24 at 11:45