When using effect coding or sum coding in linear regression (which, AFAIK, are the same thing), the contrasts for a four-level factor are coded as:
| | condition1 | condition2 | condition3 |
|---|---|---|---|
| condition1 | 1 | 0 | 0 |
| condition2 | 0 | 1 | 0 |
| condition3 | 0 | 0 | 1 |
| condition4 | -1 | -1 | -1 |
Then the model output (in R, for example) only reports confidence intervals and p-values for conditions 1-3. If we want a CI for condition 4, we have to refit the model with a different condition taking the -1 row.
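To make the setup concrete, here is a minimal sketch of the sum coding above, written in Python with NumPy rather than R. The data values are made up purely for illustration (balanced, three observations per condition), and the variable names are my own. It suggests that the three reported coefficients are each condition's deviation from the grand mean, while the condition 4 deviation is implied as minus the sum of the other three.

```python
import numpy as np

# Hypothetical toy data: 4 conditions, 3 observations each (made-up values).
y = np.array([1.0, 2.0, 3.0,      # condition1
              4.0, 5.0, 6.0,      # condition2
              7.0, 8.0, 9.0,      # condition3
              10.0, 11.0, 12.0])  # condition4
group = np.repeat(np.arange(4), 3)  # condition index (0..3) per observation

# Sum (effect) contrast matrix: one row per condition, one column per coefficient.
contrasts = np.array([[ 1,  0,  0],
                      [ 0,  1,  0],
                      [ 0,  0,  1],
                      [-1, -1, -1]], dtype=float)

# Design matrix: an intercept column plus each observation's contrast row.
X = np.column_stack([np.ones(len(y)), contrasts[group]])

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, effects = beta[0], beta[1:]

# Condition 4 has no coefficient of its own; its deviation is implied by the others.
effect4 = -effects.sum()

group_means = np.array([y[group == g].mean() for g in range(4)])
grand_mean = group_means.mean()

print("intercept:", intercept, "grand mean:", grand_mean)
print("effects 1-3:", effects, "implied effect 4:", effect4)
print("deviations from grand mean:", group_means - grand_mean)
```

With this toy data the intercept matches the mean of the condition means, and the implied condition 4 effect matches its deviation from that mean, even though the fit never reports it as a separate coefficient.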
My question is, why can't we just have something like:
| | condition1 | condition2 | condition3 | condition4 |
|---|---|---|---|---|
| condition1 | 1 | 0 | 0 | 0 |
| condition2 | 0 | 1 | 0 | 0 |
| condition3 | 0 | 0 | 1 | 0 |
| condition4 | 0 | 0 | 0 | 1 |
If two model fits together can give us a CI for each of the four conditions' differences from the grand mean, why can't a single model do that the first time? Why do we have to treat one condition differently from the rest, when effect coding is supposedly a symmetrical scheme (unlike dummy coding)?
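For reference, here is a small sketch of what the second coding looks like as a design matrix, again in Python with NumPy. It assumes the model still includes an intercept and uses a hypothetical balanced layout of three observations per condition. It only checks the column rank of the matrix, which may be relevant to why software does not offer this coding directly.

```python
import numpy as np

# Hypothetical layout: 4 conditions, 3 observations each.
group = np.repeat(np.arange(4), 3)

# One indicator column per condition, as in the second table.
indicators = np.eye(4)[group]

# Assumption: the model keeps its intercept column alongside the four indicators.
X_full = np.column_stack([np.ones(len(group)), indicators])

n_cols = X_full.shape[1]
rank = np.linalg.matrix_rank(X_full)

# The intercept column equals the sum of the four indicator columns.
dependent = np.allclose(X_full[:, 0], X_full[:, 1:].sum(axis=1))

print("columns:", n_cols, "rank:", rank, "intercept = sum of indicators:", dependent)
```

Under these assumptions the matrix has five columns but rank four, so the five coefficients are not uniquely determined without some extra constraint.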