I know that we can use different coding schemes for a factorial experiment. By changing the coding in the linear model you change the hypothesis tested by each coefficient: with dummy (treatment) coding, for example, each coefficient tests the difference between a level and the baseline level, and under a different coding the coefficient estimates, and the hypotheses they correspond to, are different. Why do the sums of squares not change when changing the coding? A deeper question: if the design matrix is non-orthogonal, how can you get a unique decomposition of the sums of squares?
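For concreteness, writing $\mu_j$ for the mean of level $j$ and $\bar\mu$ for the average of the level means, the per-coefficient null hypotheses under the two codings I use below are

$$H_0^{\text{treatment}}: \mu_j - \mu_1 = 0 \ (j = 2, 3, 4), \qquad H_0^{\text{sum}}: \mu_j - \bar\mu = 0 \ (j = 1, 2, 3).$$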
I have a feeling these two questions are strongly related. Here is an R example to give some context.
# set-up
factor   <- gl(4, 4)
response <- rnorm(16)
# fitting model 1
model1 <- lm(response ~ factor)    # using contr.treatment (default)
M1     <- model.matrix(model1)     # model matrix
# fitting model 2
contrasts(factor) <- contr.sum(4)  # changing contrasts to contr.sum
model2 <- lm(response ~ factor)    # using contr.sum
M2     <- model.matrix(model2)     # model matrix
#anova for both models
anova(model1)
Response: response
          Df  Sum Sq Mean Sq F value Pr(>F)
factor     3  1.1735 0.39118  0.1891 0.9018
Residuals 12 24.8234 2.06861
anova(model2)
Response: response
          Df  Sum Sq Mean Sq F value Pr(>F)
factor     3  1.1735 0.39118  0.1891 0.9018
Residuals 12 24.8234 2.06861
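To see concretely what each coefficient tests under the two codings, here is a minimal sketch reusing the objects fitted above; it compares the coefficients with the cell means and checks that both codings give the same fitted values (and therefore the same sums of squares):
grp_means  <- tapply(response, factor, mean)  # the four cell means
grand_mean <- mean(grp_means)                 # equals mean(response) in this balanced design
coef(model1)                                  # treatment coding: intercept = mean of level 1,
grp_means - grp_means[1]                      #   factor2..factor4 = difference of each level from that baseline
coef(model2)                                  # sum coding: intercept = grand mean,
grp_means - grand_mean                        #   factor1..factor3 = deviation of each level from the grand mean
all.equal(fitted(model1), fitted(model2))     # TRUE: identical fitted values, hence identical SS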
#model matrix for both models
> M1
(Intercept) factor2 factor3 factor4
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 1 0 0 0
5 1 1 0 0
6 1 1 0 0
7 1 1 0 0
8 1 1 0 0
9 1 0 1 0
10 1 0 1 0
11 1 0 1 0
12 1 0 1 0
13 1 0 0 1
14 1 0 0 1
15 1 0 0 1
16 1 0 0 1
> M2
(Intercept) factor1 factor2 factor3
1 1 1 0 0
2 1 1 0 0
3 1 1 0 0
4 1 1 0 0
5 1 0 1 0
6 1 0 1 0
7 1 0 1 0
8 1 0 1 0
9 1 0 0 1
10 1 0 0 1
11 1 0 0 1
12 1 0 0 1
13 1 -1 -1 -1
14 1 -1 -1 -1
15 1 -1 -1 -1
16 1 -1 -1 -1
As you can see, in M2 (the model matrix with contr.sum contrasts) the columns encoding the factor are orthogonal to the intercept column, whereas in M1 (the model matrix with contr.treatment contrasts) they are not, yet the ANOVA tables are identical. In M1, for example, the inner product of the (Intercept) and factor2 columns is 4.
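A small sketch to make this explicit: crossprod(M) computes t(M) %*% M, i.e. all column inner products at once.
crossprod(M1)  # the (Intercept) x factor2 entry is 4: the intercept and dummy columns are not orthogonal
crossprod(M2)  # the (Intercept) x factorX entries are 0: each contrast column sums to zero
               # (the contrast columns of M2 are still correlated with one another, though)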
"Why does the sums of squares not change when changing the coding?" — Look at some pictures explaining multiple regression like these. There, the R-square of the prediction (the standardised SS of the prediction, the length of $\hat Y$) reflects the angle between $Y$ and the predictor "plane X" spanned by the predictors. The predictors X could be turned any way you like, but as long as they still span the same plane X as before, R-square won't change. – ttnphns Aug 17 '16 at 15:47
"If your design matrix is non-orthogonal, how can you get a unique decomposition of sums of squares?" — That is how linear regression works with correlated predictors. In ANOVA, the linear-regression strategy of decomposing the SS of effects is called Type III SS. – ttnphns Aug 17 '16 at 15:47
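Following up on the first comment, here is a sketch (using M1 and M2 from above) that checks the two codings span the same column space, so the projection of the response, and with it every sum of squares, is the same; the change-of-basis matrix is called T_mat here just for illustration.
# Hat (projection) matrices for the two design matrices
H1 <- M1 %*% solve(crossprod(M1)) %*% t(M1)
H2 <- M2 %*% solve(crossprod(M2)) %*% t(M2)
all.equal(H1, H2)  # TRUE: same projection, so same fitted values and same SS
# Equivalently, M2 is M1 post-multiplied by an invertible change-of-basis matrix
T_mat <- solve(crossprod(M1)) %*% crossprod(M1, M2)
all.equal(M2, M1 %*% T_mat, check.attributes = FALSE)  # TRUE
With a single factor there is only one effect to decompose, so the Type I/Type III distinction the second comment alludes to does not yet bite here; it matters once several non-orthogonal terms compete for the same sums of squares.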