
I've asked a similar question here on CV, but because I'm still unsure what my options are, I've rephrased it slightly. I have a very simple model with no random slopes:

library(lme4)

y <- rnorm(7000, 0, 1)
x <- rep(c("A","B"), each=700, times=10)
g <- rep(c("g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", 
          "g9", "g10"), each=7000)

# y (length 7000) and x (length 14000) are recycled to match the 70000 rows of g
df <- data.frame(y=y, x=x, g=g)

m <- lmer(y ~ x + (1|g), data=df)
boundary (singular) fit: see ?isSingular

ranef(m)
$g
    (Intercept)
g1            0
g10           0
g2            0
g3            0
g4            0
g5            0
g6            0
g7            0
g8            0
g9            0

summary(m) shows exactly the same information as summary(lm(y ~ x, data=df)). A mixed model does not seem appropriate here, since ranef(m) is all zeros. My question is whether it is justified to fit a simple linear model to these data instead: lm(y ~ x, data=df).

locus

1 Answer


Your example uses completely "random" data in which all of the variables are independent of each other, so there is nothing to estimate here. Not only will the two models you tried give the same results, they would also match the intercept-only model lm(y ~ 1). If your models found appreciably non-zero effects for any of the independent variables, they would be overfitting the data used to fit them, and the results would be bogus. The results you see are correct.
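
To make this concrete, here is a minimal sketch on simulated data. The names sim, fit_null, fit_lm and fit_lmer, the seed, and the group sizes are all assumptions chosen for illustration; they are not taken from the question. Each variable is generated independently, and the three fits are then compared.

```r
library(lme4)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n_groups <- 10
n_per_group <- 700

# Every variable is generated independently of the others
sim <- data.frame(
  y = rnorm(n_groups * n_per_group),
  x = rep(c("A", "B"), times = n_groups * n_per_group / 2),
  g = rep(paste0("g", seq_len(n_groups)), each = n_per_group)
)

fit_null <- lm(y ~ 1, data = sim)               # intercept only
fit_lm   <- lm(y ~ x, data = sim)               # fixed effect of x
fit_lmer <- lmer(y ~ x + (1 | g), data = sim)   # plus random intercept for g

# Fixed-effect estimates side by side
cbind(lm = coef(fit_lm), lmer = fixef(fit_lmer))

# F test of the intercept-only model against the model with x
anova(fit_null, fit_lm)

# Estimated between-group standard deviation
VarCorr(fit_lmer)
```

With data like these, the slope for x should be close to zero and the between-group standard deviation should be zero or very small, which is why the mixed model adds nothing over lm.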

Tim
  • Thanks for your answer @Tim. With my actual data I'm also getting the same warning message and ranef(mydata) also shows values close to zero. However, my data are not random. Are there any tools I can use to figure out why I'm getting a singular fit? It cannot be complexity since my model is very simple (like the one in my example). – locus May 10 '22 at 12:34
  • @locus maybe, the same as in the example, in your data the variable used as random effect grouping is independent of the dependent variable? Is there any correlation? What do you see when you plot it? Based on the theory, are you sure that there should be dependence? – Tim May 10 '22 at 12:47
  • Thanks for your help Tim. You mentioned I should look at the correlation between the variable used as random effect grouping and the dependent variable. However, the random effect grouping variable is categorical (in the example above that would be g), how can I compute the correlation? – locus May 10 '22 at 14:45
  • @locus e.g. color some other plots by the categories – Tim May 10 '22 at 14:50
  • I checked my data and there are indeed very strong correlations, the majority of my gs show a positive relationship with the dependent variable. It seems that the problem is rather the number of time points within each g. I posted another question here: https://stats.stackexchange.com/questions/575088/boundary-singular-fit-due-to-many-time-points-per-id – locus May 13 '22 at 07:30
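
For a categorical grouping variable, the checks discussed in the comments could be sketched roughly as follows. This uses made-up data; dat, fit, group_means and the seed are hypothetical names and values, not the asker's actual data or model.

```r
library(lme4)

set.seed(2)  # arbitrary seed for reproducibility
dat <- data.frame(
  y = rnorm(200),
  x = rep(c("A", "B"), times = 100),
  g = factor(rep(paste0("g", 1:10), each = 20))
)

fit <- lmer(y ~ x + (1 | g), data = dat)

# Is any variance component estimated at the boundary?
isSingular(fit)

# Estimated standard deviations of the random effects and the residual
VarCorr(fit)

# Per-group means of the response: if they barely differ relative to the
# within-group spread, there is little between-group variance to estimate
group_means <- tapply(dat$y, dat$g, mean)
group_means

# One-way ANOVA of y on g as a rough check for group differences
anova(lm(y ~ g, data = dat))

# Visual check: distribution of y within each level of g
boxplot(y ~ g, data = dat)
```

If the group means are nearly identical and the boxplots overlap heavily, a random-intercept variance estimated at zero is not surprising.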