I am trying to get a deeper understanding of failures to converge in multilevel models that I estimate with lmer(). "Failure to converge" is vague; I want to be able to specify the problem that underpins these failures, and to express it numerically in terms of the data that I'm using. I am starting with little knowledge of the math that underpins estimation of these models.
Consider the following toy example:
library(lme4)
set.seed(1234)
person <- factor(rep(c("Alice", "Bob", "Catherine", "David"), each = 3))
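female <- rep(c(1L, 0L, 1L, 0L), each = 3)  # one value per person; same values the shorter rep(1:0, each = 3) gives after recycling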
myDF <- data.frame(
y = c(1:6, 10:12, 1:3),
person = person,
female = female)
lmer(y ~ (1 + female | person), data = myDF)
Running that example generates these warnings:
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
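To tie the second warning to numbers, the derivatives that lme4 checks can be inspected directly. The following is a minimal sketch, assuming the fitted object stores finite-difference derivatives in `fit@optinfo$derivs`, as shown in lme4's `?convergence` help page. The exact values, and even whether the warnings appear, may differ across lme4 versions and optimizers.

```r
library(lme4)

# Rebuild the toy data from the question.
myDF <- data.frame(
  y      = c(1:6, 10:12, 1:3),
  person = factor(rep(c("Alice", "Bob", "Catherine", "David"), each = 3)),
  female = rep(c(1L, 0L, 1L, 0L), each = 3)
)

# Refit, silencing the warnings so the stored diagnostics can be examined.
fit <- suppressWarnings(lmer(y ~ (1 + female | person), data = myDF))

# Gradient and Hessian of the objective with respect to the
# covariance parameters (theta), evaluated at the reported optimum.
derivs <- fit@optinfo$derivs

# Eigenvalues of the Hessian: a negative value means the objective curves
# downward in some direction, so the reported point is not a proper minimum.
hess_eigen <- eigen(derivs$Hessian, symmetric = TRUE, only.values = TRUE)$values

derivs$gradient                   # should be near zero at a true optimum
hess_eigen                        # one entry per covariance parameter
fit@optinfo$conv$lme4$messages    # the convergence messages stored with the fit
```

The "scaled gradient" in the first warning is, as far as I understand it, the gradient rescaled by the Hessian, so it plausibly cannot be computed when the Hessian is not positive definite.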
Although I've kept the dataset small (n = 12), the convergence warnings don't seem to be due to the small sample. There are more observations than parameters to estimate, and with a small tweak, I can generate the same warnings when (say) n = 1200, while holding constant the number of parameters.
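The parameter count behind that claim can be made explicit. This sketch assumes the usual accounting for a linear mixed model: the fixed effects, the covariance parameters of the random effects (lme4's `theta`), and one residual variance.

```r
library(lme4)

# Rebuild the toy data from the question.
myDF <- data.frame(
  y      = c(1:6, 10:12, 1:3),
  person = factor(rep(c("Alice", "Bob", "Catherine", "David"), each = 3)),
  female = rep(c(1L, 0L, 1L, 0L), each = 3)
)

fit <- suppressWarnings(lmer(y ~ (1 + female | person), data = myDF))

n_fixed <- length(fixef(fit))            # intercept only
n_theta <- length(getME(fit, "theta"))   # intercept variance, slope variance, covariance
n_resid <- 1                             # residual variance
n_par   <- n_fixed + n_theta + n_resid

c(observations = nobs(fit), parameters = n_par)   # 12 observations, 5 parameters
```

Counting observations against parameters does not show whether every parameter is separately informed by the data, which is a different question.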
That said, I have only a superficial understanding of what the warnings mean. Can the problems be expressed in terms of the data used in this example -- in terms of the values of y, the values of female, and so on?
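One data-level quantity that may be relevant is whether female varies within a person, since the model gives each person a random slope on female. A minimal sketch of that check, using only base R:

```r
# Rebuild the toy data from the question.
myDF <- data.frame(
  y      = c(1:6, 10:12, 1:3),
  person = factor(rep(c("Alice", "Bob", "Catherine", "David"), each = 3)),
  female = rep(c(1L, 0L, 1L, 0L), each = 3)
)

# Number of distinct values of `female` observed for each person.
distinct_female <- tapply(myDF$female, myDF$person,
                          function(x) length(unique(x)))
distinct_female   # 1 for every person: female never varies within a person

# Person-level means of y, alongside each person's value of female.
person_means <- aggregate(y ~ person + female, data = myDF, FUN = mean)
person_means      # Alice 2, Bob 5, Catherine 11, David 2
```

Because female is constant within each person, a person's total deviation is b0 when female = 0 and b0 + b1 when female = 1. The data therefore speak to two between-person variances, var(b0) and var(b0) + 2 cov(b0, b1) + var(b1), while the model has three covariance parameters. That could be one source of the flat or saddle-shaped direction flagged by the Hessian warning, though this sketch is not a full diagnosis.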
There are many posts here about failures to converge when using lmer(). They are useful, but they generally emphasize programming strategies (use a different optimizer, etc.) rather than the intuition behind those strategies. Some posts are helpful for developing intuition: for example, "Model failed to converge" warning in lmer() and this non-StackExchange post. But I fear that even those posts don't help me to understand the problem above. I haven't found any sources that work through simple numerical examples like this one.
[…] lmer(y ~ female + (1 | person), data = myDF), the warning goes away. – Emma Jean Jun 03 '20 at 16:00
[…] female predictor of the group-level intercepts while also adding a female fixed effect. So the problem that I mentioned in my post doesn't seem due to including the female predictor at the group level. (Whether it's sensible to include that predictor is another question.) – user697473 Jun 04 '20 at 11:45