Many posts (e.g. here and here) discuss how the Type III sums of squares produced by car::Anova() in R are incorrect or nonsensical under R's default contrast coding, "contr.treatment", in which the first level of each categorical variable serves as the reference level (absorbed into the intercept) and each remaining level is compared to it. One way to obtain the "correct" Type III sums of squares is to switch the contrast coding to sum-to-zero coding, "contr.sum".
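For reference, the two codings can be inspected directly in base R. This is only a minimal sketch for a four-level factor (matching varA in the example further down); it shows the contrast matrices themselves and does not depend on car.

```r
# Treatment coding: the first level is the reference (its row is all zeros),
# and each column is an indicator for one of the remaining levels.
trt <- contr.treatment(4)
print(trt)

# Sum-to-zero coding: the last level is coded -1 in every column,
# so each column sums to zero across the levels.
smc <- contr.sum(4)
print(smc)

# Column sums differ between the two codings: 1 under treatment coding,
# 0 under sum coding.
print(colSums(trt))
print(colSums(smc))
```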

What remains unclear to me, however, is whether there are situations in which the default "incorrect" Type III sums of squares under the "contr.treatment" coding are useful. How should these sums of squares be interpreted for the main effects, and are they always nonsensical and useless?

Below is some example R code for illustration:

# Random data for a balanced two-way ANOVA design
set.seed(2964)
df <- data.frame(response = rnorm(n = 32, mean = seq(10, 25, 5)),
                 varA = factor(rep(paste0("A", 1:4), times = 4)),
                 varB = factor(rep(paste0("B", 1:2), each = 8)))

Type I Sums of Squares

anova(lm(response ~ varA*varB, data = df))

Default "incorrect" Type III Sums of Squares

car::Anova(lm(response ~ varA*varB, data = df), type = "III", test.statistic = "F")

"Correct" Type III Sums of Squares produced under sum coding

These match the Type I Sums of Squares.

car::Anova(lm(response ~ varA*varB, data = df, contrasts = list(varA = contr.sum, varB = contr.sum)), type = "III", test.statistic = "F")
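The statement that the sum-coded Type III sums of squares match the Type I values can be checked in base R, without car. The sketch below rests on an assumption: that the Type III sum of squares for a term equals the increase in residual sum of squares when that term's columns are dropped from the model matrix while all other columns, including the interaction columns, are kept. `drop_term_ss` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Same simulated data as in the question
set.seed(2964)
df <- data.frame(response = rnorm(n = 32, mean = seq(10, 25, 5)),
                 varA = factor(rep(paste0("A", 1:4), times = 4)),
                 varB = factor(rep(paste0("B", 1:2), each = 8)))

# Hypothetical helper: increase in residual SS when the columns of one term
# (identified by its position in the model formula) are dropped.
drop_term_ss <- function(fit, term_index) {
  X <- model.matrix(fit)
  y <- model.response(model.frame(fit))
  keep <- attr(X, "assign") != term_index
  reduced <- lm.fit(X[, keep, drop = FALSE], y)
  sum(reduced$residuals^2) - deviance(fit)
}

fit_sum <- lm(response ~ varA * varB, data = df,
              contrasts = list(varA = contr.sum, varB = contr.sum))
type1 <- anova(fit_sum)

ss_A <- drop_term_ss(fit_sum, 1)  # term 1 = varA
ss_B <- drop_term_ss(fit_sum, 2)  # term 2 = varB

# In this balanced design both comparisons should agree up to rounding error
print(all.equal(ss_A, type1["varA", "Sum Sq"]))
print(all.equal(ss_B, type1["varB", "Sum Sq"]))
```

The agreement relies on the design being balanced; with unequal cell sizes the sequential (Type I) and drop-one-term sums of squares would generally differ.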

Type I Sums of Squares

Analysis of Variance Table

Response: response
          Df Sum Sq Mean Sq  F value               Pr(>F)
varA       3 991.97  330.66 330.6870 < 0.0000000000000002 ***
varB       1   0.15    0.15   0.1494              0.70247
varA:varB  3   9.23    3.08   3.0783              0.04666 *
Residuals 24  24.00    1.00

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Default "incorrect" Type III Sums of Squares

Anova Table (Type III tests)

Response: response
            Sum Sq Df  F value                Pr(>F)
(Intercept) 322.18  1 322.2059  0.000000000000002052 ***
varA        545.68  3 181.9090 < 0.00000000000000022 ***
varB          6.03  1   6.0350               0.02164 *
varA:varB     9.23  3   3.0783               0.04666 *
Residuals    24.00 24

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

"Correct" Type III Sums of Squares produced under sum coding

These correctly match the Type I Sums of Squares for this balanced case.

Anova Table (Type III tests)

Response: response
            Sum Sq Df   F value               Pr(>F)
(Intercept) 9666.9  1 9667.7398 < 0.0000000000000002 ***
varA         992.0  3  330.6870 < 0.0000000000000002 ***
varB           0.1  1    0.1494              0.70247
varA:varB      9.2  3    3.0783              0.04666 *
Residuals     24.0 24

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
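One candidate interpretation that can be probed numerically is that, under treatment coding, the "main effect" rows are really simple effects at the reference level of the other factor: varA tested within B1, and varB tested within A1. The sketch below is a check of that reading, not an established answer. It assumes the Type III sum of squares for a term is the increase in residual sum of squares when that term's columns are dropped from the model matrix, and `rss_without` is a hypothetical helper.

```r
# Same simulated data as in the question
set.seed(2964)
df <- data.frame(response = rnorm(n = 32, mean = seq(10, 25, 5)),
                 varA = factor(rep(paste0("A", 1:4), times = 4)),
                 varB = factor(rep(paste0("B", 1:2), each = 8)))

fit_trt <- lm(response ~ varA * varB, data = df)  # default contr.treatment
X <- model.matrix(fit_trt)
term <- attr(X, "assign")  # 0 = intercept, 1 = varA, 2 = varB, 3 = varA:varB

# Hypothetical helper: residual SS after dropping the columns of term k
# while keeping every other column, including the interaction columns.
rss_without <- function(k) {
  sum(lm.fit(X[, term != k, drop = FALSE], df$response)$residuals^2)
}

ss_A_trt <- rss_without(1) - deviance(fit_trt)
ss_B_trt <- rss_without(2) - deviance(fit_trt)

# Between-group SS from one-way ANOVAs restricted to the reference levels
ss_A_at_B1 <- anova(lm(response ~ varA,
                       data = subset(df, varB == "B1")))["varA", "Sum Sq"]
ss_B_at_A1 <- anova(lm(response ~ varB,
                       data = subset(df, varA == "A1")))["varB", "Sum Sq"]

print(all.equal(ss_A_trt, ss_A_at_B1))
print(all.equal(ss_B_trt, ss_B_at_A1))
print(c(varA = ss_A_trt, varB = ss_B_trt))
```

The last line prints the two drop-one-term sums of squares so they can be compared against the varA and varB rows of the default Type III table above. If they agree, the F tests in that table would be tests of these simple effects using the residual mean square of the full model.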

Darren James