In R, what does anova(f1, f2, f3) compare?

Question

I'm trying to understand the reason why anova(f1, f2, f3) and anova(f1, f2) gives me different result while anova(f1, f2, f3) and anova(f2, f3) gives me the same.

Here's the code:

> data(swiss)
> fit1 <- lm(Fertility ~ Agriculture, data=swiss)
> fit3 <- lm(Fertility ~ Agriculture + Examination + Education, data=swiss)
> fit5 <- lm(Fertility ~ Agriculture + Examination + Education + Catholic + Infant.Mortality, data=swiss)

> class(fit1)
[1] "lm"
> class(fit3)
[1] "lm"
> class(fit5)
[1] "lm"

> anova(fit1, fit3, fit5)
Analysis of Variance Table

Model 1: Fertility ~ Agriculture
Model 2: Fertility ~ Agriculture + Examination + Education
Model 3: Fertility ~ Agriculture + Examination + Education + Catholic + 
    Infant.Mortality
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1     45 6283.1                                  
2     43 3180.9  2    3102.2 30.211 8.638e-09 ***
3     41 2105.0  2    1075.9 10.477 0.0002111 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> anova(fit1, fit3)
Analysis of Variance Table

Model 1: Fertility ~ Agriculture
Model 2: Fertility ~ Agriculture + Examination + Education
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1     45 6283.1                                  
2     43 3180.9  2    3102.2 20.968 4.407e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> anova(fit3, fit5)
Analysis of Variance Table

Model 1: Fertility ~ Agriculture + Examination + Education
Model 2: Fertility ~ Agriculture + Examination + Education + Catholic + 
    Infant.Mortality
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1     43 3180.9                                  
2     41 2105.0  2    1075.9 10.477 0.0002111 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

To me, this is read like this:

anova(fit1, fit3, fit5) shows me P value 8.638e-09 for the comparison of f1 and f3.
anova(fit1, fit3) shows me 4.407e-7 for the same comparison. And this doesn't make sense!
anova(fit1, fit3, fit5) and anova(fit3, fit5) shows the same P value for the comparison of fit3 and fit5 and it's 0.0002111.

Probably I missed something. What is it?

The anova() function in R behaves differently depending on what type of object you give it. In other words, what are fit1, fit2, and fit3? What is the result of class(fit1)? — Zoë Clark, Jul 19 '14 at 21:57
They're all lm. Updated question to include code that provides class info. — Minkoo Seo, Jul 20 '14 at 04:35
From the help page of anova: "Normally the F statistic is most appropriate, which compares the mean square for a row to the residual sum of squares for the largest model considered." In addition, see this post. — COOLSerdash, Jul 20 '14 at 06:58
possible duplicate of Logic behind the ANOVA F-test in simple linear regression — COOLSerdash, Jul 20 '14 at 06:59
@COOLSerdash: your comment seems to answer, but I don't see that info contained in the possible duplicate. (Did I miss it?) Care to expand it for a separate answer here? — Nick Stauner, Jul 20 '14 at 07:22
I had another look at this. The reason, why anova(fit1, fit3, fit5) gives another $F$-value for fit3 is the scale argument in anova.lm. The help page states that "If zero this [scale] will be estimated from the largest model considered." So for anova(fit1, fit3, fit5), the scale is estimated to be $2105.043/41\approx 51.34$. Then, the $F$-value for fit3 is calculated as: $(3102.19/2)/51.34\approx 30.211$. I don't want to put this as a full answer as I feel that I don't fully understand why this is done yet. Maybe someone else can shed more light on the rationale of this procedure. — COOLSerdash, Jul 20 '14 at 09:21
@COOLSerdash Thank you for taking a look. But why is it trying to compare fit3 and fit5 for the F-value of anova(fit1, fit3, fit5)? Shouldn't it be comparing fit1 VS fit3 (second row) and fit3 and fit5 (third row)? That's what I don't understand.
Clearly, ?anova shows me this "When given a sequence of objects, anova tests the models against one another in the order specified." And I believe anova.lm should behave in the same way. — Minkoo Seo, Jul 20 '14 at 10:00
Please note that anova when used with models of the class lm is anova.lm. anova is just the generic function. As I have tried to explain: The main difference is in the estimate of the noise variance $\sigma^{2}$. If you include the model fit5, anova uses that model to estimate the noise variance. On the other hand: if only fit1 and fit3 are included, then fit3 is used to estimate the noise variance, resulting in a different $F$ and $p$-value, respectively. — COOLSerdash, Jul 20 '14 at 10:17
Thanks @COOLSerdash. I understand your explanation. What I'm saying is the same. anova() is generic and calls anova.lm() when the parameter is lm class. For that reason, anova.lm should respect the promise that anova has made. And the promise is that fit1 VS fit3 and fit3 VS fit5 are the comparisons. Otherwise, anova.lm violates the interface of anova. So far, I don't understand how anova.lm is doing such comparisons by using fit5's noise for the row of fit3. I frankly start feeling that I should just accept this weirdness. Thank you again. — Minkoo Seo, Jul 20 '14 at 15:49

In R, what does anova(f1, f2, f3) compare?

0 Answers0