I know that we can use different coding schemes for a factorial experiment. By changing the coding in the linear model you change the hypothesis tested by each coefficient: with dummy (treatment) coding, for example, each coefficient tests the difference between a level and the baseline level, and under a different coding the coefficient estimates, and the hypotheses they correspond to, are different. Why do the sums of squares not change when changing the coding? A deeper question: if the design matrix is non-orthogonal, how can you get a unique decomposition of the sums of squares?
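For concreteness, writing $\mu_j$ for the mean of level $j$ and $\bar\mu$ for the average of the level means, the per-coefficient null hypotheses under the two codings I use below are

$$H_0^{\text{treatment}}: \mu_j - \mu_1 = 0 \ (j = 2, 3, 4), \qquad H_0^{\text{sum}}: \mu_j - \bar\mu = 0 \ (j = 1, 2, 3).$$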
I have a feeling these two questions are strongly related. Here is an R example to give some context.
# set-up
factor   <- gl(4, 4)
response <- rnorm(16)
# fitting model 1
model1 <- lm(response ~ factor)    # using contr.treatment (default)
M1     <- model.matrix(model1)     # model matrix
# fitting model 2
contrasts(factor) <- contr.sum(4)  # changing contrasts to contr.sum
model2 <- lm(response ~ factor)    # using contr.sum
M2     <- model.matrix(model2)     # model matrix
#anova for both models
anova(model1)
Response: response
          Df  Sum Sq Mean Sq F value Pr(>F)
factor     3  1.1735 0.39118  0.1891 0.9018
Residuals 12 24.8234 2.06861
anova(model2)
Response: response
          Df  Sum Sq Mean Sq F value Pr(>F)
factor     3  1.1735 0.39118  0.1891 0.9018
Residuals 12 24.8234 2.06861
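To see concretely what each coefficient tests under the two codings, here is a minimal sketch reusing the objects fitted above; it compares the coefficients with the cell means and checks that both codings give the same fitted values (and therefore the same sums of squares):
grp_means  <- tapply(response, factor, mean)  # the four cell means
grand_mean <- mean(grp_means)                 # equals mean(response) in this balanced design
coef(model1)                                  # treatment coding: intercept = mean of level 1,
grp_means - grp_means[1]                      #   factor2..factor4 = difference of each level from that baseline
coef(model2)                                  # sum coding: intercept = grand mean,
grp_means - grand_mean                        #   factor1..factor3 = deviation of each level from the grand mean
all.equal(fitted(model1), fitted(model2))     # TRUE: identical fitted values, hence identical SS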
#model matrix for both models
> M1
(Intercept) factor2 factor3 factor4
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 1 0 0 0
5 1 1 0 0
6 1 1 0 0
7 1 1 0 0
8 1 1 0 0
9 1 0 1 0
10 1 0 1 0
11 1 0 1 0
12 1 0 1 0
13 1 0 0 1
14 1 0 0 1
15 1 0 0 1
16 1 0 0 1
> M2
(Intercept) factor1 factor2 factor3
1 1 1 0 0
2 1 1 0 0
3 1 1 0 0
4 1 1 0 0
5 1 0 1 0
6 1 0 1 0
7 1 0 1 0
8 1 0 1 0
9 1 0 0 1
10 1 0 0 1
11 1 0 0 1
12 1 0 0 1
13 1 -1 -1 -1
14 1 -1 -1 -1
15 1 -1 -1 -1
16 1 -1 -1 -1
As you can see, in M2 (the model matrix with contr.sum contrasts) the columns encoding the factor are orthogonal to the intercept column, whereas in M1 (the model matrix with contr.treatment contrasts) they are not, yet the ANOVA tables are identical. In M1, for example, the inner product of the (Intercept) and factor2 columns is 4.
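A small sketch to make this explicit: crossprod(M) computes t(M) %*% M, i.e. all column inner products at once.
crossprod(M1)  # the (Intercept) x factor2 entry is 4: the intercept and dummy columns are not orthogonal
crossprod(M2)  # the (Intercept) x factorX entries are 0: each contrast column sums to zero
               # (the contrast columns of M2 are still correlated with one another, though)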
"Why does the sums of squares not change when changing the coding?" — Look at some pictures explaining multiple regression like these. There, the R-square of the prediction (the standardised SS of the prediction, the length of $\hat Y$) reflects the angle between $Y$ and the predictor "plane X" spanned by the predictors. The predictors X could be turned any way you like, but as long as they still span the same plane X as before, R-square won't change. – ttnphns Aug 17 '16 at 15:47
"If your design matrix is non-orthogonal, how can you get a unique decomposition of sums of squares?" — That is how linear regression works with correlated predictors. In ANOVA, the linear-regression strategy of decomposing the SS of effects is called Type III SS. – ttnphns Aug 17 '16 at 15:47
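Following up on the first comment, here is a sketch (using M1 and M2 from above) that checks the two codings span the same column space, so the projection of the response, and with it every sum of squares, is the same; the change-of-basis matrix is called T_mat here just for illustration.
# Hat (projection) matrices for the two design matrices
H1 <- M1 %*% solve(crossprod(M1)) %*% t(M1)
H2 <- M2 %*% solve(crossprod(M2)) %*% t(M2)
all.equal(H1, H2)  # TRUE: same projection, so same fitted values and same SS
# Equivalently, M2 is M1 post-multiplied by an invertible change-of-basis matrix
T_mat <- solve(crossprod(M1)) %*% crossprod(M1, M2)
all.equal(M2, M1 %*% T_mat, check.attributes = FALSE)  # TRUE
With a single factor there is only one effect to decompose, so the Type I/Type III distinction the second comment alludes to does not yet bite here; it matters once several non-orthogonal terms compete for the same sums of squares.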