I have a question about contrasts in `lm()` in R and how they relate to generalized inverses. In this post, the second answer says that you take the generalized inverse of the coding matrix (the matrix specifying which comparisons you want to test), add an intercept column afterwards, and pass the result to `lm()` as the contrast matrix. The key point is that the design matrix $X$ becomes $XC^{-1}$, where $C^{-1}$ is what is passed to the function, and the least squares estimator is then computed as usual. It seems to me, though, that `ginv()` is not actually needed, because you can just specify a square contrast matrix. Is that right?
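
To make the setup concrete, here is a minimal sketch of what I understand that answer to be doing. The toy data, the choice of comparisons, and the names `L`, `H`, `C_ginv`, and `C_solve` are my own assumptions for illustration and are not taken from the linked answer.

```r
library(MASS)  # for ginv()

# Hypothetical toy data: 3 groups, 4 observations each.
set.seed(1)
g <- factor(rep(c("a", "b", "c"), each = 4))
y <- rnorm(12, mean = c(1, 2, 4)[as.integer(g)])

# Coding matrix: each row is a comparison of group means I want to test.
L <- rbind("a-b" = c(1, -1, 0),
           "b-c" = c(0, 1, -1))

# Route 1: generalized inverse of the 2 x 3 coding matrix;
# lm() adds the intercept column itself.
C_ginv <- ginv(L)

# Route 2: add an intercept row to make the matrix square, take an
# ordinary inverse, then drop the intercept column.
H <- rbind(intercept = rep(1 / 3, 3), L)
C_solve <- solve(H)[, -1]

fit_ginv  <- lm(y ~ g, contrasts = list(g = C_ginv))
fit_solve <- lm(y ~ g, contrasts = list(g = C_solve))

# Compare the two fits, and compare the non-intercept coefficients
# with the requested differences of group means.
group_means <- as.vector(tapply(y, g, mean))
all.equal(unname(coef(fit_ginv)), unname(coef(fit_solve)))
all.equal(unname(coef(fit_ginv)[2:3]), unname(drop(L %*% group_means)))
```

In this toy case, where each row of `L` sums to zero, both comparisons come back `TRUE`, which is what makes me think the square-matrix route could replace `ginv()`.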

My understanding from a textbook, however, is that the generalized inverse is needed when using the overparameterized model, with parameters $\mu,\tau_1,\tau_2,\tau_3$ for data with only 3 groups. The matrix $X'X$ is then not of full rank, and it is the generalized inverse of $X'X$ that is taken. Given the resulting parameter estimates, a linear combination of the parameters whose estimate is the same regardless of which generalized inverse is used is called an estimable function. In that context there is no real talk of specifying contrasts: you simply estimate the model and then take linear combinations of the resulting parameters. For example, the estimate of $\tau_1-\tau_2$ will be exactly the same regardless of the generalized inverse used and regardless of which parameter is constrained to be zero. I find this strange in comparison with the method of specifying explicit contrasts, modifying the design matrix, and then estimating with least squares as normal.
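
Here is a minimal sketch of the textbook approach as I understand it, again on hypothetical toy data. The names `X`, `G`, `b_mp`, `b_t3`, and `lambda` are mine, and the two generalized inverses shown (Moore-Penrose, and a zero-padded inverse corresponding to $\tau_3 = 0$) are just two convenient choices.

```r
library(MASS)  # for ginv()

# Hypothetical toy data: 3 groups, 4 observations each.
set.seed(1)
g <- factor(rep(c("a", "b", "c"), each = 4))
y <- rnorm(12, mean = c(1, 2, 4)[as.integer(g)])

# Overparameterized design: columns for mu, tau_1, tau_2, tau_3.
# It has 4 columns but only rank 3, so X'X is singular.
X   <- cbind(mu = 1, model.matrix(~ g - 1))
XtX <- crossprod(X)
Xty <- crossprod(X, y)

# Generalized inverse 1: Moore-Penrose.
b_mp <- ginv(XtX) %*% Xty

# Generalized inverse 2: invert the full-rank block that leaves out
# tau_3 and pad with zeros (this amounts to constraining tau_3 = 0).
G <- matrix(0, 4, 4)
G[1:3, 1:3] <- solve(XtX[1:3, 1:3])
b_t3 <- G %*% Xty

# tau_1 - tau_2: the same value from both solutions.
lambda <- c(0, 1, -1, 0)
drop(lambda %*% b_mp)
drop(lambda %*% b_t3)

# tau_1 on its own: the two solutions disagree.
b_mp[2]
b_t3[2]
```

Both solutions give the same value for $\tau_1-\tau_2$ (the difference of the first two group means), while the individual $\tau_1$ estimates differ, which is the invariance the textbook describes.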

Can anyone explain the equivalence between these two approaches, and in particular the role the generalized inverse plays in each? Looking at the "overparameterized model" case, I don't really see why you would need a rank-deficient $X'X$ and a generalized inverse at all, given that in the contrast approach you modify $X$ manually.

I know there are posts about these topics already, but none of them seem to address my specific question.

