The contrast matrix you use is not square, it does not have a regular inverse. You need to use the Moore-Penrose (pseudo-)inverse of the contrast matrix, which is calculated by MASS::ginv().
Starting from you contrast matrix, what you want to calculate is:
\begin{equation}
\begin{pmatrix}
1 & -0.5 & -0.5 \\
1 & 0 & -1 \\
\end{pmatrix}\
\begin{pmatrix}\mu_{a} \\\mu_{b} \\\mu_{c}\end{pmatrix}
= \begin{pmatrix} \mu_{a}-\frac{(\mu_{b}+\mu_{c})}{2} \\\mu_{a}-\mu_{c} \end{pmatrix}
\end{equation}
Which is the comparisons you want, right?
If this is the case, your contrast matrix is actually
t(contr_mat)
[,1] [,2] [,3]
[1,] 1 -0.5 -0.5
[2,] 1 0.0 -1.0
So why do you need to take the inverse?
Let's construct the inverse, with an added intercept:
C1 <- cbind(1, ginv(t(contr_mat)))
round(C1, 2)
[,1] [,2] [,3]
[1,] 1 0.67 0
[2,] 1 -1.33 1
[3,] 1 0.67 -1
If we invert C1 back, we get:
round(ginv(C1), 2)
[,1] [,2] [,3]
[1,] 0.33 0.33 0.33
[2,] 1.00 -0.50 -0.50
[3,] 1.00 0.00 -1.00
Which is your original contrast matrix, with an added grand mean row:
\begin{equation}
C\mu = \begin{pmatrix}
1/3 & 1/3 & 1/3 \\
1 & -0.5 & -0.5 \\
1 & 0 & -1 \\
\end{pmatrix}\
\begin{pmatrix}
\mu_{a} \\\mu_{b} \\\mu_{c}\end{pmatrix}
= \begin{pmatrix}
\frac{\mu_{a}+\mu_{b}+\mu_{c}}{3} \\
\mu_{a}-\frac{(\mu_{b}+\mu_{c})}{2} \\
\mu_{a}-\mu_{c} \end{pmatrix}
\end{equation}
So we took your original contrast matrix, transposed it, and took the pseudoinverse, added an intercept. If we invert this matrix, we recover your contrasts.
Why did we invert your matrix?
Note that the matrix we built above (with C1 <- cbind(1, ginv(t(contr_mat))) ) is square and full rank: you can take the regular inverse of this matrix, just as for C. Let's then call it $C^{-1}$, because a property of a regular inverse is:
$C^{-1} C = CC^{-1} = I$, with $I$ the identity matrix.
The model supporting your comparisons and used with the lm() function can be expressed as:
$y = X\mu + \epsilon$
Where X is the design matrix, $\mu$ the vector of means, and $\epsilon$ the random component with normal distribution.
If X is the design matrix for the cell means model, lm() calculates the means using the least square equation:
$\hat{\mu} =(X^{\prime }X)^{-1}\ X^{\prime }y\\$
Where $X^{\prime}$ is the transpose of $X$.
If we now modify the design matrix and coefficients to incorporate C, your contrast matrix, we get:
\begin{equation}
y = X\mu + \epsilon = XI\mu + \epsilon = X(C^{-1}C)\mu + \epsilon = (XC^{-1})(C\mu) + \epsilon
\end{equation}
So you can use the modified design matrix $(XC^{-1})$ to evaluate your contrasts $(C\mu)$, using the least square method, in the context of your original model.
If you use the modified design matrix $(XC^{-1})$, the least squares equations become:
\begin{equation}
\hat{C\mu} =[(XC^{-1})^{\prime }XC^{-1}]^{-1}\ (XC^{-1})^{\prime }y\\
\end{equation}
And you get your desired contrasts $C\hat{\mu}$.
That's why you need to calculate the inverse of C, to evaluate the contrasts $C\mu$.
To show that it works in R with your example:
x = sample(letters[1:3], size = 300, replace = T)
y = rnorm(300)
contr_mat = matrix(c(1, -0.5, -0.5,
1, 0, -1),
nrow = 3, ncol = 2)
Let's build the modified design matrix:
X <- model.matrix(~x + 0) # model matrix for the cell means model
X1 <- X %*% cbind(1,ginv(t(contr_mat))) # this is the modified model matrix, using the pseudoinverse.
And we solve by the method of least squares or by lm(). lm() takes the inverse of your contrast matrix as argument and adds the intercept term, then evaluates the contrasts by the least squares method:
# least-squares equation:
solve ( t(X1) %% X1 ) %% t(X1) %*% y
[,1]
[1,] -0.06524626
[2,] 0.03627632
[3,] 0.04820965
lm() with modified design matrix
lm(y ~ X1 +0 ) %>% summary
Call:
lm(formula = y ~ X1 + 0)
Residuals:
Min 1Q Median 3Q Max
-3.1953 -0.7073 0.0250 0.7382 2.7251
Coefficients:
Estimate Std. Error t value Pr(>|t|)
X11 -0.06525 0.06049 -1.079 0.282
X12 0.03628 0.12596 0.288 0.774
X13 0.04821 0.14336 0.336 0.737
lm() with your contrasts
lm(y ~ x, contrasts = list(x = ginv(t(contr_mat)))) %>% summary
Call:
lm(formula = y ~ x, contrasts = list(x = ginv(t(contr_mat))))
Residuals:
Min 1Q Median 3Q Max
-3.1953 -0.7073 0.0250 0.7382 2.7251
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.06525 0.06049 -1.079 0.282
x1 0.03628 0.12596 0.288 0.774
x2 0.04821 0.14336 0.336 0.737
The last two rows contain the comparisons you want:
means <- tapply(X = y, INDEX = factor(x), FUN = mean)
C %*% means
[,1]
[1,] -0.06524626
[2,] 0.03627632
[3,] 0.04820965
(means[1] + means[2] + means[3])/3
-0.06524626
means[1] - (means[2] + means[3])/2
0.03627632
means[1] - means[3]
0.04820965