
$C$ is defined as the set of ($k\times 1$) vectors $c$ such that $c^{\top}\beta$ is estimable.

Today I learned that estimable is defined by the following condition: the expectation of a linear combination of the $Y$ equals a linear combination of $\beta.$

I need to show that the dimension of $C$ is equal to the $\operatorname{rank}(X) = k$. Here, $X$ is the matrix of the regressors.

Regarding this exercise, I have two questions:

  • What is actually meant with the dimension of a set?
  • How can I show that the dimension of $C$ is equal to the $\operatorname{rank}(X) = k$?
Tim
  • Although "dimension" has many different characterizations and definitions in mathematics, in any context referring to linear combinations it is clear that vector space dimension is intended. It is, equivalently, the smallest cardinality of a spanning set or the largest cardinality of a linearly independent set. – whuber Jan 11 '23 at 01:09
  • It looks like your $X$ is $n \times k$ (because $\beta$ is in $\mathbb{R}^k$). In this case $\operatorname{rank}(X) = k$ means $X$ has full column rank, in which case every $c'\beta$ is estimable, hence $C = \mathbb{R}^k$ and $\dim(C) = k$ (a one-line check of this case is sketched after the comments). A more interesting case is when the rank of $X$ is smaller than the number of regressors; in other words, when $\beta \in \mathbb{R}^p$ with $p > k$. The goal is then to prove $\dim(C) = k$. – Zhanxiong Jan 11 '23 at 02:34
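
A one-line check of the full-column-rank case mentioned in the comment above (a sketch, not part of the original discussion): if $X$ is $n \times k$ with $\operatorname{rank}(X) = k$, then $X'X$ is invertible, and taking $v = X(X'X)^{-1}c$ gives $E(v'Y) = v'X\beta = c'(X'X)^{-1}X'X\beta = c'\beta$ for every $c \in \mathbb{R}^k$, so indeed $C = \mathbb{R}^k$ and $\dim(C) = k$.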

2 Answers


What the OP has encountered is precisely the concept of the estimation space, i.e. the column space $\mathcal C(\mathbf X)$ (cf. $\rm [I]$).

Consider an orthonormal basis $\langle \boldsymbol\alpha_j\rangle_{j=1}^k$ of $\mathcal C(\mathbf X)$, so that $\mathbb E[\mathbf y] = [\boldsymbol\alpha_1~ \boldsymbol\alpha_2 \cdots~\boldsymbol\alpha_k]\boldsymbol\beta$ (taking, for concreteness, the columns of $\mathbf X$ to be this basis). Observe

\begin{align}\mathbb E\left[\boldsymbol\alpha_j^\top\mathbf y\right] &=\boldsymbol\alpha_j^\top[\boldsymbol\alpha_1~ \boldsymbol\alpha_2 \cdots ~\boldsymbol\alpha_j ~\cdots~\boldsymbol\alpha_k ]\boldsymbol\beta \\ &= [ 0~0\cdots~ 1~\cdots ~0]\boldsymbol\beta\\ &= \beta_j\tag 1\label 1;\end{align}

For any $\boldsymbol{c} = (c_1,\ldots,c_k)^\top\in \mathbb R^k,$ $\eqref 1$ gives $\mathbb E\left[c_j\left(\boldsymbol\alpha_j^\top\mathbf y\right)\right] = c_j\beta_j$ for each $j.$ Summing over $j,$

$$\mathbb E\left[\underbrace{\left(\sum_{j=1}^k c_j\alpha_j\right)^\top}_{\equiv\boldsymbol \ell^\top}\mathbf y\right] =\boldsymbol c^\top\boldsymbol\beta;\tag 2\label 2 $$

By construction, $\boldsymbol \ell\in \operatorname{span}\{\boldsymbol\alpha_1, \boldsymbol\alpha_2, \ldots,\boldsymbol\alpha_k\} = \mathcal C(\mathbf X),$ so $\eqref 2$ shows that every $\boldsymbol c^\top\boldsymbol\beta$ is estimable through some $\boldsymbol \ell^\top\mathbf y$ with $\boldsymbol\ell \in \mathcal C(\mathbf X).$ Conversely, for a linear function of the parameters to be estimable, there must be an $\boldsymbol \ell^\top\mathbf y$ with $\boldsymbol\ell \in \mathcal C(\mathbf X)$ for which $\eqref 2$ holds.

Hence, the dimension of '$C$' is $k$, as it corresponds to the $k$-dimensional estimation space.
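
A quick numerical check of this construction (a minimal NumPy sketch; the dimensions, $\mathbf X$, $\boldsymbol\beta$, and $\boldsymbol c$ below are arbitrary illustrations, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3

# Design whose columns form an orthonormal basis of C(X),
# matching the setting used in the displays above.
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
X = Q                                        # columns alpha_1, ..., alpha_k

beta = rng.standard_normal(k)
c = rng.standard_normal(k)

# (1): E[alpha_j' y] = alpha_j' X beta = beta_j, since E[y] = X beta.
print(np.allclose(X.T @ (X @ beta), beta))           # True

# (2): ell = sum_j c_j alpha_j = X c lies in C(X) and E[ell' y] = c' beta.
ell = X @ c
print(np.isclose(ell @ (X @ beta), c @ beta))        # True
```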


References:

$\rm [I]$ Plane Answers to Complex Questions: The Theory of Linear Models, Ronald Christensen, Springer Science$+$Business Media, $2011,$ sec. $3.1,$ p. $49$.

$\rm [II]$ Analysis of Variance, Henry Scheffé, John Wiley & Sons, $1959,$ sec. $1.6,$ p. $23.$

utobi

$\DeclareMathOperator{\rank}{rank}$

Per my comment under the question, I am assuming the more interesting case in which $X \in \mathbb{R}^{n \times p}$ with $p > \rank(X) = k$. In this case, $C$ is a set of $p \times 1$ vectors (rather than $k \times 1$ vectors as you stated).

First of all, your definition of estimability

"The expectation of a linear combination of $Y$ equals to a linear combination of $\beta$."

can be more formally stated as:

$c'\beta$ is estimable if there exists $v \in \mathbb{R}^n$ such that $c'\beta = E(v'Y)$ for every $\beta \in \mathbb{R}^p$.

And it is equivalent to the following definition:

$c'\beta$ is estimable if for any $\beta_1 \in \mathbb{R}^p, \beta_2 \in \mathbb{R}^p$ such that $X\beta_1 = X\beta_2$, it holds that $c'\beta_1 = c'\beta_2$.

For reference convenience, let me call your definition "Definition I" and my definition "Definition II". I prefer Definition II for its intuitiveness -- it simply expresses that $c'\beta$ is uniquely determined by the image of $\beta$ under $X$, hence it is "estimable" -- i.e., "identifiable" under the linear model setting. As will be clear in the proof below, adopting Definition II also makes your question easier to solve. For more background on the estimability of the linear functional $c'\beta$, check this answer.

To begin with, you should first show that the set $C = \{c \in \mathbb{R}^p: c'\beta \text{ is estimable}\}$ is indeed a subspace of $\mathbb{R}^p$. This is straightforward to verify using Definition II (a short sketch is given after the list):

  1. $0 \in C$.
  2. If $c_1 \in C, c_2 \in C$, then any linear combination of $c_1$ and $c_2$ is also in $C$.
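
A sketch of the verification under Definition II: item 1 is immediate since $0'\beta_1 = 0 = 0'\beta_2$ whenever $X\beta_1 = X\beta_2$; for item 2, if $c_1, c_2 \in C$ and $X\beta_1 = X\beta_2$, then for any scalars $a, b$,
$$(ac_1 + bc_2)'\beta_1 = a\,c_1'\beta_1 + b\,c_2'\beta_1 = a\,c_1'\beta_2 + b\,c_2'\beta_2 = (ac_1 + bc_2)'\beta_2,$$
so $ac_1 + bc_2 \in C$.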

Once you have verified that $C$ is a subspace, it makes sense to talk about its dimension: every subspace of a finite-dimensional vector space has a well-defined dimension -- this is something very basic in linear algebra.

In fact, if we denote by $N(A)$ and $R(A)$ the null space and the range space of a matrix $A$, respectively, it can be shown that $C = N^\perp(X) = R(X')$, whence $$\dim(C) = \dim(N^\perp(X)) = \dim(R(X')) = \rank(X) = k.$$ Note that $N^\perp(X) = R(X')$ is a well-known general linear algebra result (a proof is given at the end of this answer), so it suffices to prove $C = N^\perp(X)$. Here is the proof:

Suppose $c \in C$. For any $\beta \in N(X)$, $X\beta = 0 = X0$, which implies $c'\beta = c'0 = 0$, i.e., $c \perp \beta$, or $c \in N^\perp(X)$. This shows $C \subset N^\perp(X)$.

Conversely, if $\gamma \in N^\perp(X)$, then for any $\beta \in N(X)$ we have $\gamma'\beta = 0$. If $\beta_1, \beta_2 \in \mathbb{R}^p$ are such that $X\beta_1 = X\beta_2$, then $X(\beta_1 - \beta_2) = 0$, i.e., $\beta_1 - \beta_2 \in N(X)$. Therefore, $\gamma'(\beta_1 - \beta_2) = 0$, or $\gamma'\beta_1 = \gamma'\beta_2$, i.e., $\gamma \in C$. This shows $N^\perp(X) \subset C$.

In summary, $C = N^\perp(X)$. This completes the proof.
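
A numerical illustration of this characterization (a minimal NumPy sketch; the dimensions and the particular matrices and vectors are arbitrary choices, not part of the original proof): an orthonormal basis of $R(X')$ has $k = \rank(X)$ elements, a vector in $R(X')$ gives a $c'\beta$ that is invariant to perturbations of $\beta$ within $N(X)$ (i.e., estimable in the sense of Definition II), and a vector with a component in $N(X)$ does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, k = 10, 5, 3

# A rank-deficient design: X = A @ B has rank k < p.
A = rng.standard_normal((n, k))
B = rng.standard_normal((k, p))
X = A @ B

# Right-singular vectors: the first k rows of Vt span R(X'), the rest span N(X).
_, s, Vt = np.linalg.svd(X)
row_basis = Vt[:k]                       # orthonormal basis of R(X'), dim = rank(X) = k
null_vec = Vt[k]                         # one vector in N(X): X @ null_vec ~ 0
print(np.linalg.matrix_rank(X) == k, np.allclose(X @ null_vec, 0))    # True True

beta = rng.standard_normal(p)

# c in R(X'): c'beta is unchanged when beta moves within N(X)  (estimable).
c_est = row_basis[0]
print(np.isclose(c_est @ beta, c_est @ (beta + null_vec)))            # True

# c with a component in N(X): c'beta changes, so it is not estimable.
c_bad = c_est + null_vec
print(np.isclose(c_bad @ beta, c_bad @ (beta + null_vec)))            # False
```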


Proof of the equivalence of Definition I and Definition II.

Definition I implies Definition II. Since $E(Y) = X\beta$, if there exists $v \in \mathbb{R}^n$ such that $c'\beta = E(v'Y)$ for every $\beta \in \mathbb{R}^p$, then $c'\beta = v'X\beta$ for every $\beta$. Hence if $X\beta_1 = X\beta_2$, it must hold that $c'\beta_1 = c'\beta_2$.

Definition II implies Definition I. With Definition II, it has been shown above that if $c'\beta$ is estimable, then $c \in R(X')$, i.e., there exists $v \in \mathbb{R}^n$ such that $c = X'v$. Therefore, for any $\beta \in \mathbb{R}^p$, $c'\beta = v'X\beta = E(v'Y)$, which is Definition I.
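
A small numerical check of the direction "Definition II implies Definition I" (a NumPy sketch; the rank-deficient $X$ and the vectors below are arbitrary illustrations): starting from some $c = X'v_0 \in R(X')$, one can recover a $v$ with $X'v = c$ and verify $c'\beta = v'X\beta = E(v'Y)$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 10, 5, 3
X = rng.standard_normal((n, k)) @ rng.standard_normal((k, p))   # rank(X) = k < p

# c in R(X'): c = X'v0 for some v0, so c'beta should be estimable.
v0 = rng.standard_normal(n)
c = X.T @ v0

# Recover some v with X'v = c (the system is consistent since c is in R(X')),
# then check c'beta = v'(X beta) = E(v'Y) for an arbitrary beta.
v, *_ = np.linalg.lstsq(X.T, c, rcond=None)
beta = rng.standard_normal(p)
print(np.isclose(c @ beta, v @ (X @ beta)))                      # True
```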


Proof of $N^\perp(X) = R(X')$. Suppose $y \in R(X')$; then $y = X'\alpha$ for some $\alpha \in \mathbb{R}^n$. Hence for any $\beta \in N(X)$, $y'\beta = \alpha'X\beta = \alpha'0 = 0$. This shows $y \in N^\perp(X)$, i.e., $R(X') \subset N^\perp(X)$. On the other hand, by the rank-nullity theorem, $$\dim(R(X')) = \dim(R(X)) = p - \dim(N(X)) = \dim(N^\perp(X)).$$

Combining this with $R(X') \subset N^\perp(X)$, it follows that $N^\perp(X) = R(X')$.

Zhanxiong