I'm taking a stats class, and we're working in R and doing a bunch of different types of regression. Where I'm having trouble is an F-test on nested models (a partial F-test, I think):

# Create the complete model
fit_complete <- lm(wage_growth ~ unemployment + gdp + unemployment:gdp + I(unemployment^2) + I(gdp^2), data = economic_subset)

# Create the reduced model
fit_reduced <- lm(wage_growth ~ unemployment + gdp + unemployment:gdp, data = economic_subset)

# Perform the F-test
anova(fit_complete, fit_reduced)

Which gives me this: ANOVA results

So here's what I'm wondering: my alpha is 0.05, and the p-value is clearly lower than that, so I can reject the null and conclude that the beta for unemployment^2 and/or gdp^2 is not zero. When I did ANOVAs in my biostats class with treatment groups, we'd run post hoc tests to see which of the groups differed from each other. But for the life of me, I can't figure out how to do that here.

Do I even need to do it here? Would I just include both terms in my final model? Or do I just need to run additional anovas where I only remove one term each time rather than both?
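For concreteness, the last option (removing one squared term at a time) might look like the sketch below. It runs on simulated data, because economic_subset is not shown here: the data frame `sim`, its coefficients, and the names `fit_no_u2`, `fit_no_g2`, `f_u2`, `f_g2` and `t_vals` are all made up for illustration.

```r
# Hypothetical stand-in for economic_subset (simulated, for illustration only)
set.seed(1)
n <- 200
sim <- data.frame(unemployment = runif(n, 3, 10), gdp = rnorm(n, 2, 1))
sim$wage_growth <- 1 - 0.5 * sim$unemployment + 0.8 * sim$gdp +
  0.03 * sim$unemployment^2 + rnorm(n)

# Complete model, same formula as in the question
fit_complete <- lm(wage_growth ~ unemployment + gdp + unemployment:gdp +
                     I(unemployment^2) + I(gdp^2), data = sim)

# Two reduced models, each dropping exactly one squared term
fit_no_u2 <- lm(wage_growth ~ unemployment + gdp + unemployment:gdp +
                  I(gdp^2), data = sim)
fit_no_g2 <- lm(wage_growth ~ unemployment + gdp + unemployment:gdp +
                  I(unemployment^2), data = sim)

# Partial F-test for each single term (smaller model listed first)
f_u2 <- anova(fit_no_u2, fit_complete)
f_g2 <- anova(fit_no_g2, fit_complete)
print(f_u2)
print(f_g2)

# t statistics from the complete model, for comparison
t_vals <- coef(summary(fit_complete))[, "t value"]
print(t_vals[c("I(unemployment^2)", "I(gdp^2)")]^2)
```

When only one coefficient is dropped, the partial F statistic equals the square of that coefficient's t statistic in the complete model, so these two tests repeat what summary(fit_complete) already reports for the squared terms.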

  • I would guess both unemployment and GDP are numeric variables (not categorical variables that have groups). So which groups do you want to compare? Also, checking the significance of predictors to decide whether to include them in the model has disadvantages; see "Why are p-values misleading after performing a stepwise selection?". It would help to explain the goal of this analysis. – dipetkov Jan 26 '23 at 09:25
  • I agree with @dipetkov. There are no variables defining groups here, so not only is there no need to think about post hoc comparison of groups: there is no scope to do that. (If you have variables indicating, say, countries or years, and they imply some grouping of the data, that is a different analysis altogether.) A different detail here: if you are fitting a quadratic in unemployment, then unemployment and its square have a joint effect, but there is no sense in which one has an effect separate from the other. – Nick Cox Jan 26 '23 at 10:50
  • On the edits: we're human too, but on a forum like CV, just ask your technical question directly. We know very well that questions arise at any level from beginner upwards, so there is no need to explain your personal context. – Nick Cox Jan 26 '23 at 10:52
  • @dipetkov so I'm comparing a complete linear model to a reduced one, and in the reduced one I have removed both unemployment^2 and gdp^2. The null is that the coefficients for both of those are zero; the alternative is that at least one of the coefficients isn't. So I guess what I'm confused about is how to tell whether it's unemployment^2, gdp^2, or both. – Heather Marie Jan 27 '23 at 01:41
  • This is called feature selection, and you could do it by fitting all these models and comparing them. See more in threads on the topic. The important question is why you are doing these comparisons. – dipetkov Jan 27 '23 at 08:20
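
The "fit all these models and compare them" suggestion in the last comment could be sketched roughly as follows. This is only an illustration on simulated data, not the asker's economic_subset: the data frame `sim`, the list `candidates` and the table `comparison` are hypothetical names. The choice of residual sum of squares and AIC as the things to compare is also an assumption, since the comment does not say which criterion to use.

```r
# Hypothetical stand-in for economic_subset (simulated, for illustration only)
set.seed(2)
n <- 200
sim <- data.frame(unemployment = runif(n, 3, 10), gdp = rnorm(n, 2, 1))
sim$wage_growth <- 1 - 0.5 * sim$unemployment + 0.8 * sim$gdp +
  0.03 * sim$unemployment^2 + rnorm(n)

# Reduced formula from the question; each candidate adds zero, one, or both squares
base <- wage_growth ~ unemployment + gdp + unemployment:gdp
candidates <- list(
  neither = lm(base, data = sim),
  u2_only = lm(update(base, . ~ . + I(unemployment^2)), data = sim),
  g2_only = lm(update(base, . ~ . + I(gdp^2)), data = sim),
  both    = lm(update(base, . ~ . + I(unemployment^2) + I(gdp^2)), data = sim)
)

# One row per candidate: number of coefficients, residual sum of squares, AIC
comparison <- data.frame(
  model  = names(candidates),
  n_coef = sapply(candidates, function(m) length(coef(m))),
  rss    = sapply(candidates, deviance),
  aic    = sapply(candidates, AIC)
)
print(comparison)
```

The residual sum of squares can only go down as terms are added, so it cannot pick a model by itself. Whatever criterion is used, the caveat from the first comment still applies: p-values reported for a model chosen this way are hard to interpret.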

0 Answers