
In many applications, it is common to standardize the covariates. This means centering each covariate at its mean and dividing it by its standard deviation.

Mathematically, how does this affect the design matrix $X$? Suppose $X_s$ is the standardized version of $X$. Does this mean that $X_s^TX_s = X_sX_s^T = I$, i.e., that $X_s$ is orthogonal (or orthonormal)?

Adrian
  • Swapping the order of the matrices in the matrix products will give matrices of different sizes (i.e. $X_s^TX_s \ne X_sX_s^T$). This should tip you off that something about what you wrote is not true. (Indeed, the matrix $X_s$ is not orthogonal, as a simulation will show and give a counterexample.) – Dave Mar 22 '21 at 02:48

1 Answer


Let $x_1$ and $x_2$ be two columns in your design matrix, and let their covariance be $\operatorname{cov}(x_1, x_2) > 0$.

Mean-center the variables so that $z_1 = x_1 - \bar{x}_1$ and $z_2 = x_2 - \bar{x}_2$. The covariance between $z_1$ and $z_2$ is

$$ \operatorname{cov}(z_1, z_2) = \dfrac{1}{N-1}\sum(z_{1,i})(z_{2,i}) = \dfrac{1}{N-1} \sum(x_{1, i} - \bar{x}_1)(x_{2, i} - \bar{x}_2) >0$$

since the sample means of the $z_i$ vectors are 0 by construction. Note that

$$ (N-1)\operatorname{cov}(z_1, z_2) = \sum(z_{1,i})(z_{2,i}) = z_1^Tz_2$$
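As a quick numerical check of this identity, here is a minimal sketch in plain Python. The two columns `x1` and `x2` are hypothetical toy data, not taken from the question. The covariance is computed from the raw data with the shortcut formula $\left(\sum x_{1,i}x_{2,i} - N\bar{x}_1\bar{x}_2\right)/(N-1)$, so the comparison with $z_1^Tz_2$ is not circular.

```python
from statistics import mean

# Hypothetical toy data: two positively related columns.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
n = len(x1)
m1, m2 = mean(x1), mean(x2)

# Mean-center each column.
z1 = [v - m1 for v in x1]
z2 = [v - m2 for v in x2]

# Inner product z1^T z2 of the centered columns.
dot = sum(a * b for a, b in zip(z1, z2))

# Sample covariance of the raw columns (divisor N - 1), shortcut formula.
cov = (sum(a * b for a, b in zip(x1, x2)) - n * m1 * m2) / (n - 1)

# The two printed numbers should agree up to floating-point rounding.
print(dot, (n - 1) * cov)
```

Because the covariance here is positive, the inner product of the centered columns is positive too, so they cannot be orthogonal.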

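Hence, if the original columns have non-zero covariance (i.e. they are correlated), then the centered versions are not orthogonal. Dividing each centered column by its standard deviation only rescales it by a positive constant, so the standardized columns are not orthogonal either.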

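Thus, the columns of $X_s$ can't be orthogonal unless the columns of $X$ are uncorrelated. Even in that case, each standardized column has squared norm $N-1$ (using the $N-1$ divisor for the standard deviation), so $X_s^TX_s = (N-1)I$ rather than $I$.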
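To see this concretely, the following sketch standardizes a small design matrix and forms both products from the question. It assumes hypothetical toy columns `x1` and `x2` and the sample standard deviation with divisor $N-1$ (which is what `statistics.stdev` computes). The helper `standardize` is introduced only for illustration.

```python
from statistics import mean, stdev

def standardize(col):
    """Center a column at its mean and divide by its sample sd (N - 1 divisor)."""
    m, s = mean(col), stdev(col)
    return [(v - m) / s for v in col]

# Hypothetical toy data: two correlated columns of a design matrix.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
cols = [standardize(x1), standardize(x2)]  # columns of X_s
rows = list(zip(*cols))                    # rows of X_s
n, p = len(rows), len(cols)

# X_s^T X_s is p x p: entry (i, j) is the inner product of columns i and j.
gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

# X_s X_s^T is n x n: entry (i, j) is the inner product of rows i and j.
outer = [[sum(a * b for a, b in zip(ri, rj)) for rj in rows] for ri in rows]

print("X_s^T X_s:", gram)
print("shape of X_s X_s^T:", len(outer), "x", len(outer[0]))
```

The diagonal of `gram` equals $N-1$ up to rounding, and the off-diagonal entries equal $(N-1)$ times the sample correlation, which is non-zero for these data. The second product has a different shape ($N \times N$ instead of $p \times p$), so the two products cannot be equal here.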