I'm running some regression models in R using brms and lme4.

When I run a Bayesian model:

library(brms)

priors.mod.1 <- c(set_prior("normal(0, 1)", class = "Intercept"),
                  set_prior("normal(0, 1)", class = "b"),
                  set_prior("cauchy(0, 1)", class = "sd"),
                  set_prior("cauchy(0, 1)", class = "sigma"), 
                  set_prior("dirichlet(1, 1)", class = "simo", coef = "motest.time.num1"),
                  set_prior("dirichlet(1, 1)", class = "simo", coef = "motest.time.num:training.dum1"),
                  set_prior("dirichlet(1, 1)", class = "simo", coef = "motest.time.num:Condition.dum1"),
                  set_prior("dirichlet(1, 1)", class = "simo", coef = "motest.time.num:training.dum:Condition.dum1"),
                  set_prior("lkj(2)", class = "cor"))

maximal.mod.ln <- brm(RT ~ mo(test.time.num) * training.dum * Condition.dum + (mo(test.time.num) * training.dum | Ppt.No),
                      family = gaussian(),
                      chains = 4,
                      iter = 2000,
                      warmup = 1000,
                      seed = 1234, 
                      prior = priors.mod.1,
                      save_pars = save_pars(all = TRUE),
                      data = dat)

My population level estimates look like this:

[Screenshot: population-level estimates from the brms model]

However, when I run a frequentist model:

library(lme4)

mod <- lmer(RT ~ test.time.num * training.dum * Condition.dum + (test.time.num * training.dum | Ppt.No),
            data = dat)

they look like this:

[Screenshot: fixed-effect estimates from the lmer model]

Does anybody know why the estimates are so different (my random-effect estimates are also wildly different in scale)? My outcome variable (RT) is measured in milliseconds, and the output from the frequentist model looks as if the coefficients are in the units of the outcome variable. I have tried a variety of prior distributions, but the Bayesian results seem robust to these changes.

It looks as if the Bayesian coefficient estimates are standardised, and if I standardise my frequentist model coefficients, I get similar-looking values:

[Screenshot: standardised coefficients from the frequentist model]

but from what I've found online, I don't think standardisation is a default for brms models. Perhaps it is because I am modelling one of my variables as a monotonic effect in the Bayesian model? Could that cause my coefficient estimates to be standardised?
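
For reference, a minimal sketch of one way to get standardised lmer coefficients, by refitting on scaled data (this assumes RT and test.time.num are numeric; the dummy variables are left as-is):

dat.std <- dat
dat.std$RT <- as.numeric(scale(dat.std$RT))
dat.std$test.time.num <- as.numeric(scale(dat.std$test.time.num))

mod.std <- lmer(RT ~ test.time.num * training.dum * Condition.dum +
                  (test.time.num * training.dum | Ppt.No),
                data = dat.std)
fixef(mod.std)  # fixed effects on the standardised scale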

Any help or advice would be most appreciated!

  • Two things I'd like to know. (1) Looking at your setup for the Markov chain: iter = 2000 and warmup = 1000 is quite a short chain. What happens if you set iter = 2 * 10^4 and warmup = 10^4? (2) Also, what is your sample size? It could just be the case that the prior is much stronger than the likelihood (because of small n?), so the posterior stays fairly close to the prior. – jcken Mar 30 '22 at 09:17
  • Hi @jcken, thanks for your comments. 1) I will try this setup and get back to you. 2) Sample size is 58 participants. – codegoblin1996 Mar 30 '22 at 09:23
  • $n=58$ is quite small (and the number of parameters is 9?), so it might just be a case of the prior dominating your likelihood/data. The $Normal(0,1)$ prior tells me that you think the parameters with this prior will lie in $(-2,2)$ with probability $0.95$. Once you have checked that a longer Markov chain doesn't notably change the results, you should criticise your prior. I have no domain knowledge of your experiment, so take this recommendation with a pinch of salt: try Normal(0, 1000). Does it give you answers close to lmer's? Apply other "diffuse" priors to the other parameters too (see the sketch after this thread). – jcken Mar 30 '22 at 09:54
  • @jcken diffuse priors don't seem to change the estimates. Trying a combination of diffuse priors and more iterations (could be a while, my computer isn't the fastest!) – codegoblin1996 Mar 30 '22 at 12:35
  • @jcken no luck with the extra iterations/warmup – still getting the same estimates. But thank you for your help. – codegoblin1996 Mar 30 '22 at 17:44
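
For concreteness, a minimal sketch of the wider priors jcken suggests (the exact values are illustrative, not a recommendation; update() refits an existing brms model with new settings):

priors.diffuse <- c(set_prior("normal(0, 1000)", class = "Intercept"),
                    set_prior("normal(0, 1000)", class = "b"),
                    set_prior("cauchy(0, 10)", class = "sd"),
                    set_prior("cauchy(0, 10)", class = "sigma"))

# Refit with diffuse priors and a longer chain; brms falls back to its
# defaults for the simplex and correlation priors left unspecified here
maximal.mod.diffuse <- update(maximal.mod.ln,
                              prior = priors.diffuse,
                              iter = 2e4, warmup = 1e4, seed = 1234)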

1 Answer


These two models have different design matrices, so it's not surprising you get different results.

Model A:

RT ~ mo(test.time.num) * training.dum * Condition.dum + (mo(test.time.num) * training.dum | Ppt.No)

Model B:

RT ~ test.time.num * training.dum * Condition.dum + (test.time.num * training.dum | Ppt.No)

So in Model A you specify time as a monotonic predictor. brms::mo is intended for ordinal categorical variables, so why apply it to time? Perhaps that is the appropriate way to handle the time variable, but then you don't do the same in Model B.
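
If it helps to see the difference concretely, here is a rough sketch of how to inspect what each model actually fits (model.matrix is base R; standata is from brms; the objects are those from the question):

# Fixed-effects design matrix of the frequentist model
X.freq <- model.matrix(~ test.time.num * training.dum * Condition.dum, data = dat)
head(X.freq)

# Data brms passes to Stan, including the separate simplex terms
# it creates for each mo() interaction
sdata <- standata(maximal.mod.ln)
str(sdata, max.level = 1)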

PS. And yes, brms centers continuous predictors because this improves convergence; however, it reports coefficient estimates on the original scale. brms is also not a black box: you can easily see the Stan code behind a brms model with the functions make_stancode (without fitting) or stancode (after fitting).
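
For example (a sketch reusing the formula, priors, and data from the question):

# Stan code without fitting the model
make_stancode(RT ~ mo(test.time.num) * training.dum * Condition.dum +
                (mo(test.time.num) * training.dum | Ppt.No),
              data = dat, family = gaussian(), prior = priors.mod.1)

# Stan code of the already fitted model
stancode(maximal.mod.ln)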

dipetkov
  • Just noting that one important implication of brms temporarily centering predictors during fitting is that the prior on the intercept should be set accordingly, i.e. it applies to the intercept with all continuous predictors held at their means. – Lachlan Mar 30 '22 at 22:46
  • @Lachlan Good point. I didn't look at the priors because I think the results are explained by the difference in model/likelihood. But now that you mention priors: the intercept in the LMM is 1376, so a normal(0,1) prior for the intercept in the hierarchical model is probably not a good idea. – dipetkov Mar 30 '22 at 22:54
  • Thanks for your comments. I wasn't sure how to specify a monotonic variable in lme4, but a monotonic effect is definitely the right choice for what I need. However, my results are the same whether I treat test.time.num as monotonic or continuous in the Bayesian model. I will try adjusting my intercept prior, thanks again. – codegoblin1996 Mar 31 '22 at 08:36
  • A bit of searching on CV turned up this answer from Ben Bolker about including an ordered categorical predictor in an LMM. Also consider starting simple, e.g. simulate data with one ordered factor, as you try to figure out how lmer and brms parametrize the model. – dipetkov Mar 31 '22 at 08:56
  • Cool, thank you @dipetkov. Still playing around with the priors and still not finding different estimates. Will keep playing around with the parameterisation of the model and update here if I find anything out. – codegoblin1996 Mar 31 '22 at 09:48
  • One way I often check results is to run the Bayesian procedure with flat priors and request that Stan compute the posterior mode, then check this against the frequentist maximum likelihood estimates. – Frank Harrell Aug 21 '22 at 11:26