The dummy variable trap (including a dummy variable for every category together with a constant term in the regression, which guarantees perfect multicollinearity) is most commonly resolved by dropping one of the dummies.
However, I was also told that an equivalent alternative approach is to add a constraint that the coefficients of ALL of the dummy variables sum to zero, $\sum_{i \in \text{Category}} \beta_i = 0$.
I am having trouble proving this claim, namely that the predictions (in the OLS sense, $X\beta$) from the two approaches are equivalent.
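A quick numerical check on made-up toy data suggests the claim is true; I just cannot prove it. Here the sum-to-zero model is fitted by substituting $\beta_{\text{Group 3}} = -\beta_{\text{Group 1}} - \beta_{\text{Group 2}}$ into the design (i.e. effects coding):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 12 observations, 4 in each of 3 groups.
g = np.repeat([0, 1, 2], 4)
D = np.eye(3)[g]                                  # one-hot dummy columns
y = np.array([1.0, 2.0, 3.0])[g] + rng.normal(size=12)

# Approach 1: drop Group 3's dummy, keep the intercept.
X_drop = np.column_stack([np.ones(12), D[:, 0], D[:, 1]])
b_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)

# Approach 2: keep all groups but impose sum-to-zero by substituting
# beta_3 = -(beta_1 + beta_2), so column i becomes (d_i - d_3).
X_sum = np.column_stack([np.ones(12), D[:, 0] - D[:, 2], D[:, 1] - D[:, 2]])
b_sum, *_ = np.linalg.lstsq(X_sum, y, rcond=None)

print(np.allclose(X_drop @ b_drop, X_sum @ b_sum))  # True: same fitted values
```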
Here is my attempt:
We know that the coefficients of restricted least squares (RLS) under the constraint $R\beta = r$ are $\beta^{RLS} = \beta^{OLS} - (X^TX)^{-1}R^T[R(X^TX)^{-1}R^T]^{-1}(R\beta^{OLS} - r)$.
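As a sanity check, this closed form agrees with solving the first-order (KKT) conditions of the constrained problem directly. The sketch below uses a made-up full-rank design, since the formula requires $(X^TX)^{-1}$ to exist:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 4
X = rng.normal(size=(n, p))            # full-rank design, so (X^T X)^{-1} exists
y = rng.normal(size=n)
R = np.array([[0.0, 1.0, 1.0, 1.0]])   # a single restriction R beta = r
r = np.zeros(1)

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y

# Restricted least squares via the closed-form formula above
M = R @ XtX_inv @ R.T
b_rls = b_ols - XtX_inv @ R.T @ np.linalg.solve(M, R @ b_ols - r)

# Cross-check: KKT conditions of min ||y - Xb||^2 s.t. Rb = r
K = np.block([[X.T @ X, R.T], [R, np.zeros((1, 1))]])
b_kkt = np.linalg.solve(K, np.concatenate([X.T @ y, r]))[:p]

print(np.allclose(b_rls, b_kkt))       # True
print(R @ b_rls)                       # ~0, i.e. the restriction holds
```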
We can formulate the "drop one out" solution (for a categorical variable with 3 groups) as follows:
$$\min_{\beta} \|Y-X\beta\|^2 \quad \text{s.t.} \quad \beta_{\text{Group 3}} = 0,$$
where $\beta = [\beta_{\text{intercept}}, \beta_{\text{Group 1}}, \beta_{\text{Group 2}}, \beta_{\text{Group 3}}]^T$.
We can express the constraint $\beta_{\text{Group 3}} = 0$ as
$R = [0, 0, 0, 1]$ and $r = 0$
The alternative proposal for handling the dummy variable trap, that the coefficients of the dummy variables sum to zero, can be expressed as:
$R = [0, 1, 1, 1]$ and $r = 0$
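On made-up dummy-trap data, these two restrictions do produce identical predictions. Note that $X^TX$ is singular when the intercept and all three dummies are included, so the sketch below (with `restricted_fit` being a helper I wrote for this check) solves each restricted problem through its KKT system instead of the closed-form $\beta^{RLS}$ expression:

```python
import numpy as np

rng = np.random.default_rng(2)
g = np.repeat([0, 1, 2], 4)                # 3 groups, 4 observations each (made up)
D = np.eye(3)[g]
X = np.column_stack([np.ones(12), D])      # intercept + all dummies: X^T X is singular
y = np.array([1.0, 2.0, 3.0])[g] + rng.normal(size=12)

def restricted_fit(X, y, R, r):
    """min ||y - Xb||^2 s.t. Rb = r, via the KKT system
    [[X^T X, R^T], [R, 0]] [b; lam] = [X^T y; r],
    which remains solvable even though X^T X is singular here."""
    p, q = X.shape[1], R.shape[0]
    K = np.block([[X.T @ X, R.T], [R, np.zeros((q, q))]])
    return np.linalg.solve(K, np.concatenate([X.T @ y, r]))[:p]

b_drop = restricted_fit(X, y, np.array([[0.0, 0.0, 0.0, 1.0]]), np.zeros(1))
b_sum = restricted_fit(X, y, np.array([[0.0, 1.0, 1.0, 1.0]]), np.zeros(1))

print(np.allclose(X @ b_drop, X @ b_sum))  # True: identical predictions
print(b_drop)                              # the coefficients themselves differ
print(b_sum)
```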
I need to show that $X\beta^{RLS,\,\text{drop-one-out}} = X\beta^{RLS,\,\text{sum}=0}$.
$X\beta^{RLS} = X(X^TX)^{-1}X^TY - X(X^TX)^{-1}R^T[R(X^TX)^{-1}R^T]^{-1}R\beta^{OLS}, \hspace{5mm}\text{since } r = 0 \text{ in both cases}$
I tried proceeding with two different simplifications.
$$\begin{align}
X\beta^{RLS} &= X(X^TX)^{-1}\left[X^TY - R^T[R(X^TX)^{-1}R^T]^{-1}R\beta^{OLS}\right] \\
&= X(X^TX)^{-1}\left[X^TY - R^T[R(X^TX)^{-1}R^T]^{-1}R(X^TX)^{-1}X^TY\right] \\
&= X(X^TX)^{-1}\left[I - R^T[R(X^TX)^{-1}R^T]^{-1}R(X^TX)^{-1}\right]X^TY
\end{align}$$
Since $R$ is not square and hence not invertible in the usual sense, I was not able to proceed to show that $X\beta^{RLS}$ is the same under the two different constraint matrices $R$.
The second simplification that I tried was
$$\begin{align}
X\beta^{RLS} &= X(X^TX)^{-1}X^TY - X(X^TX)^{-1}R^T[R(X^TX)^{-1}R^T]^{-1}R\beta^{OLS} \\
&= X(X^TX)^{-1}X^TY - X(X^TX)^{-1}R^T[R(X^TX)^{-1}R^T]^{-1}R(X^TX)^{-1}X^TY \\
&= X(X^TX)^{-1}X^TY - D^T[R(X^TX)^{-1}R^T]^{-1}DY, \quad \text{where } D = R(X^TX)^{-1}X^T
\end{align}$$
Again, I cannot see how to continue to prove that the prediction from linear regression, $X\beta^{RLS}$, is equal under the two constraints: 1) $\beta_{\text{Group 3}} = 0$ and 2) $\beta_{\text{Group 1}} + \beta_{\text{Group 2}} + \beta_{\text{Group 3}} = 0$.
Any help is much appreciated. Thank you.