5

I'm following up on this great answer regarding running Principal Component Analysis (PCA) to uncover the reason behind lack of convergence and/or singularity for Mixed-Effects models.

My model below doesn't converge; however, I wonder why I get a convergence failure when the model is not singular?

NOTE: When we drop the correlation between intercepts and slopes, the model below becomes singular! i.e., lmer(math ~ ses*sector + (ses || sch.id), data = dat)
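(For reference, the uncorrelated-random-effects fit can be checked directly with lme4's standard `isSingular()` helper; this is just a sketch of that check:)

```r
library(lme4)

dat <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/nc.csv')

# Double-bar syntax drops the intercept-slope correlation
m_uncor <- lmer(math ~ ses*sector + (ses || sch.id), data = dat)

isSingular(m_uncor)  # TRUE, per the note above: this fit is singular
```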

library(lme4)

dat <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/nc.csv')

m4 <- lmer(math ~ ses*sector + (ses | sch.id), data = dat)

summary(m4)

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 sch.id   (Intercept)  3.6302  1.9053       
          ses          0.7356  0.8577   0.46
 Residual             10.1951  3.1930       
Number of obs: 7185, groups:  sch.id, 160

summary(rePCA(m4))

$sch.id
Importance of components:
                         [,1]   [,2]
Standard deviation     0.6118 0.2321
Proportion of Variance 0.8742 0.1258
Cumulative Proportion  0.8742 1.0000

rnorouzian
  • 3,986

1 Answer

4

It is important to understand that non-convergence and a singular fit are not the same thing at all. Indeed, when you obtain a singular fit, the model has actually converged, whereas when you obtain a non-convergence warning, the model has not converged. Running principal components analysis on the variance-covariance matrix of the random effects for a model that has not converged is not going to help, unless the underlying problem was an overfitted random effects structure and the optimizer had happened to stop at a location where this could be identified. This seems unlikely in practice.

The optimizer will stop and generate a warning when certain conditions are met. Sometimes it is possible to change the parameters of the optimizer, such as increasing the number of iterations, changing the algorithm, changing the tolerance level, etc., and it will converge. Sometimes changing the starting values for the random effects will work, and sometimes it is simply not possible to find a solution.
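For example, the optimizer and its iteration budget can be adjusted through `lmerControl` (a sketch only; the choice of "bobyqa" and the maxfun value here are illustrative, not a recommendation):

```r
library(lme4)

dat <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/nc.csv')

# Refit with a different optimizer and a larger iteration budget.
# "bobyqa" and maxfun = 1e5 are illustrative choices.
m4b <- lmer(math ~ ses*sector + (ses | sch.id), data = dat,
            control = lmerControl(optimizer = "bobyqa",
                                  optCtrl = list(maxfun = 1e5)))
```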

In this particular case, restarting with current random effects estimates works:

> m4 <- lmer(math ~ ses*sector + (ses | sch.id), data = dat)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.00279006 (tol = 0.002, component 1)
> theta <- getME(m4, "theta")
> m5 <- update(m4, start = theta)
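After restarting, it is worth confirming that the refit actually converged and is not singular. A sketch using standard lme4 accessors (run after the m5 fit above):

```r
# Stored convergence messages: an empty list means no warnings were recorded
m5@optinfo$conv$lme4

# FALSE if the random-effects covariance matrix is full rank
isSingular(m5)
```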
Robert Long
  • 60,630
  • 1
    Thanks Rob! But when we drop the correlation between intercepts and slope the model below becomes singular! i.e., lmer(math ~ ses*sector + (ses || sch.id), data = dat) – rnorouzian Sep 29 '20 at 15:04
  • 1
    That does seem strange, but these are complicated models. Note that the fixed effect inference (point estimates and standard errors) in the singular model is pretty much the same as in the non-singular model, and if you remove the random slopes altogether, the inference is still the same. The model without random slopes also has a slightly smaller AIC, so I would probably remove the random slopes from the model. I would also look at a plot of the data to see whether there appears to be much variation in slopes between schools. – Robert Long Sep 29 '20 at 15:59
  • Rob, I can tell you now what's strange about it! The model only contains level-2 predictors (i.e., ses and sector). This model is fundamentally incorrect from a model-leveling perspective, right? – rnorouzian Oct 03 '20 at 05:08
  • If ses is average socio-economic status for each school, then random slopes means that each school has its own slope for ses? Does that make sense? – Robert Long Oct 03 '20 at 08:45
  • Rob, ses is NOT average socio-economic status for each school. I also contacted another HLM expert. Here is what he said: "we cannot put level 2 predictors in the random part. Because the level 2 predictor is constant within the cluster, while the HLM random part is the analysis about how the cluster affects predictors. If it's constant that means the cluster has no effect." – rnorouzian Oct 03 '20 at 15:20
  • I have no idea what ses is. I was just using it as an example of something that would be constant at the school level. I am literally saying exactly what you have just quoted: when I said "Does that make sense?" it was a rhetorical question — of course it does not make sense, because it's constant. – Robert Long Oct 03 '20 at 15:56
  • blme is not fully Bayesian iirc, whereas stan definitely is, so that could be the reason. – Robert Long Oct 03 '20 at 16:59