I have a pan-Arctic dataset of green-up days and want to relate it to weather variables. I therefore want to use a linear mixed-effects model in which region (e.g. Alaska, Yukon, Northwest Territories, Eastern Siberia) is the random effect and the weather variables are fixed effects. I want to find out how the weather influences the day of green-up (given as day of the year). I know that spatial differences across the Arctic introduce some random variation.

library(lme4)
reg.result <- lmer(greenup ~ precip + temp + windspeed + relativehumidity + (1 + precip + temp + windspeed + relativehumidity | region), data = greenup, REML = FALSE)

I do not want only a varying intercept with fixed slopes for each region, so I add the term (1+precip+temp+windspeed+relativehumidity|region). This gives me a different intercept and different slopes for each level of the random effect (region).

Alternatively, I could just fit a separate linear regression for each region, e.g.:

Alaska <- lm(greenup ~ precip+temp+windspeed+relativehumidity, data = AK)
Yukon <- lm(greenup ~ precip+temp+windspeed+relativehumidity, data = YK)
NWT <- lm(greenup ~ precip+temp+windspeed+relativehumidity, data = NWT)
...
ESib <- lm(greenup ~ precip+temp+windspeed+relativehumidity, data = ESib)

My question is then: why would I not just fit a separate linear regression for each region and thereby take out this random effect myself?

Would it give me a different result?

Thomas
  • I am not really sure about what effects you are interested in. Almost any formula could make sense depending on what you are after. Also, could you clarify what your alternative of "multiple different linear regressions" would look like? Could you add the formulae you envision for this alternative approach? – frank Sep 29 '22 at 09:25
  • Sorry, I added some details. – Thomas Sep 29 '22 at 09:33

1 Answer

The difference between your random-effects (RE) approach and your multiple-models (MM) approach is that with MM each model uses no data from the other regions, whereas with RE the estimates for each region also draw on the data from the other regions. So if the data for, say, AK would on their own give coefficients that are very different from those for the other regions, the RE model pulls those coefficients toward the values of the other regions, making them "more equal" (this is known as shrinkage, or partial pooling).
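To make the "more equal" adjustment concrete, here is the textbook shrinkage formula for the simplest possible case: a random-intercept model with no predictors and known variances. This is a deliberate simplification, not your full model with random slopes, and the symbols ($\mu$, $\tau^2$, $\sigma^2$, $n_j$, $\lambda_j$) are introduced here only for illustration.

$$
y_{ij} = \mu + u_j + \varepsilon_{ij}, \qquad u_j \sim N(0, \tau^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
$$

$$
\hat{\mu}_j^{\mathrm{RE}} = \lambda_j \, \bar{y}_j + (1 - \lambda_j)\, \hat{\mu}, \qquad \lambda_j = \frac{\tau^2}{\tau^2 + \sigma^2 / n_j}
$$

Here $\bar{y}_j$ is the mean of region $j$ (which is what a separate model per region would report, i.e. $\lambda_j = 1$), $\hat{\mu}$ is the overall mean, and $n_j$ is the number of observations in region $j$. The RE estimate is a weighted average of the two: the fewer or noisier the observations in a region, the smaller $\lambda_j$ and the stronger the pull toward the overall mean. When the between-region variance $\tau^2$ is large relative to $\sigma^2 / n_j$, $\lambda_j$ is close to 1 and RE and MM nearly coincide.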

So you should use random effects if you think that the submodels for all the different levels (in your case the levels are the regions) are similar.

This is particularly helpful if a certain region, e.g. NWT, has very little data. With RE you then borrow information from the other regions and use it for NWT, too.
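If you want to see how much the two approaches differ in practice, a minimal sketch along the following lines might help. It uses simulated data, not yours: the eight region names, the single predictor temp, and all numeric settings are assumptions chosen only for illustration. It fits one lm() per region (MM) and one lmer() with a random intercept and slope (RE), then puts the per-region slopes side by side.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (an assumption, not the asker's data):
# 8 hypothetical regions with 40 observations each and a single predictor.
regions <- paste0("region", 1:8)
n_per   <- 40
d <- data.frame(
  region = factor(rep(regions, each = n_per)),
  temp   = rnorm(length(regions) * n_per)
)

# True region-specific intercepts and slopes scatter around common means
b0  <- rnorm(length(regions), mean = 150, sd = 5)
b1  <- rnorm(length(regions), mean = -3,  sd = 2)
idx <- as.integer(d$region)
d$greenup <- b0[idx] + b1[idx] * d$temp + rnorm(nrow(d), sd = 4)

# MM: one separate lm() per region (no pooling)
mm <- t(sapply(split(d, d$region),
               function(x) coef(lm(greenup ~ temp, data = x))))

# RE: one mixed model with a random intercept and slope per region (partial pooling)
re      <- lmer(greenup ~ temp + (1 + temp | region), data = d, REML = FALSE)
re_coef <- coef(re)$region

# Per-region slopes side by side
cmp <- data.frame(
  region    = rownames(mm),
  slope_MM  = mm[, "temp"],
  slope_RE  = re_coef[, "temp"],
  row.names = NULL
)
print(cmp)

# The fixed-effect slope is the value the RE slopes are pulled toward
print(fixef(re)["temp"])
```

With a balanced design and a fair amount of data per region, the two slope columns may turn out quite close; the gap tends to grow when a region has few or noisy observations. With all four weather variables as random slopes and only a handful of regions, the full model may also produce singular-fit or convergence warnings, which is worth checking on the real data.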

By the way, what do you think changes the greenup values between regions, if not the different weather in those regions?

frank
  • Okay, that is interesting, but all regions have the same number of observations (n = 40). Would you still recommend an RE model then? I do not know; it could be that the dominant tree species respond differently to changes in weather. – Thomas Oct 03 '22 at 15:22
  • Yes, if you think that the dependence of greenup on the weather features should be similar across the regions, then you should use RE, even if you have the same number of observations in all regions. It might not make a big difference; you would have to try and check. – frank Oct 04 '22 at 02:50
  • I am still a bit confused about this. When I use an RE model like lmer(V1~V2+V3+(1+V2+V3|V4)), I basically get different slopes/coefficients per region, which would be equivalent to fitting a linear model for each region, right? – Thomas Oct 04 '22 at 14:56
  • Yes, but the coefficients you get for RE are in general different from the respective coefficients of MM. – frank Oct 04 '22 at 15:10
  • Okay, thanks a lot. Do you know whether I can compare the intercepts and slopes of the regions with one another, and thereby test for significant differences between regions in their greenup day and in how the weather influences it? – Thomas Oct 04 '22 at 15:14
  • Comparing model coefficients is quite a different matter. Please don't extend your original question, rather, ask a new one. Anyways, maybe this helps: https://stats.stackexchange.com/questions/93540/testing-equality-of-coefficients-from-two-different-regressions – frank Oct 04 '22 at 15:29
  • I actually did here, sorry. https://stats.stackexchange.com/questions/591055/how-do-i-check-statistical-difference-in-effect-size-of-the-random-variable-from – Thomas Oct 04 '22 at 15:33