
When you use the method of least squares you estimate the parameters in the following way:

$$\min_{\mathbf{b}} (\mathbf{y} - \mathbf{X}\mathbf{b})^T(\mathbf{y} - \mathbf{X}\mathbf{b})$$

where $\mathbf{y}$ is $n \times 1$, $\mathbf{X}$ is $n \times (p + 1)$, and $\mathbf{b}$ is $(p+1) \times 1$.

Solving this problem yields the following first-order condition:

$$\mathbf{X}^T\mathbf{X}\mathbf{b} = \mathbf{X}^T\mathbf{y}$$
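As a quick sanity check (a sketch, not what lm() does internally), the normal equations can be solved directly in R, using the same mtcars model that appears below:

```r
# Solve the normal equations X'X b = X'y directly (for illustration only;
# lm() itself uses a QR decomposition, which is numerically more stable)
model <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(model)
y <- mtcars$mpg
b <- solve(crossprod(X), crossprod(X, y))  # crossprod(X) is t(X) %*% X
b
coefficients(model)  # same values
```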

According to R version 4.3.1, a QR decomposition is used to find $\mathbf{b}$ (see ?lm in relation to method = "qr"). Therefore, if $\mathbf{X}$ is of full column rank, it can be expressed as $\mathbf{X} = \mathbf{QR}$, where $\mathbf{Q}$ is an orthogonal matrix with dimensions $n \times (p+1)$ and $\mathbf{R}$ is an upper triangular matrix with dimensions $(p + 1) \times (p + 1)$. According to Wikipedia (here, here and here) we have the following:

  • $\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}$
  • For $\mathbf{R}$ we have that $r_{ij} = 0$ for $i > j$

Then applying this to the first-order condition we have that:

$$\mathbf{X}^T\mathbf{X}\mathbf{b} = \mathbf{X}^T\mathbf{y}$$

$$(\mathbf{QR})^T\mathbf{QR}\mathbf{b} = (\mathbf{QR})^T\mathbf{y}$$

$$\mathbf{R}^T\mathbf{Q}^T\mathbf{QR}\mathbf{b} = \mathbf{R}^T\mathbf{Q}^T\mathbf{y}$$

$$\mathbf{Q}^T\mathbf{QR}\mathbf{b} = \mathbf{Q}^T\mathbf{y}$$

$$\mathbf{Rb} = \mathbf{Q}^T\mathbf{y}$$

$$\mathbf{b} = \mathbf{R}^{-1}\mathbf{Q}^T\mathbf{y}$$

Using the following reproducible example you can see that the above result applies:

model <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(object = model)
QR <- qr(x = X)
R <- qr.R(qr = QR)
Q <- qr.Q(qr = QR)
y <- as.matrix(mtcars['mpg'])
b <- solve(R) %*% t(Q) %*% y
b
#>                     mpg
#> (Intercept) 37.22727012
#> wt          -3.87783074
#> hp          -0.03177295
coefficients(model)
#> (Intercept)          wt          hp 
#> 37.22727012 -3.87783074 -0.03177295

Created on 2023-11-04 with reprex v2.0.2

Then we have the following result:

$$\mathbf{\hat{y}} = \mathbf{Xb} = \mathbf{X}\mathbf{R}^{-1}\mathbf{Q}^T\mathbf{y} = \mathbf{QR}\mathbf{R}^{-1}\mathbf{Q}^T\mathbf{y} = \mathbf{Q}\mathbf{Q}^T\mathbf{y} = \mathbf{y}$$

This is clearly false, because it is not always the case that $\mathbf{\hat{y}} = \mathbf{y}$.
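The flaw is easy to see numerically: $\mathbf{Q}\mathbf{Q}^T\mathbf{y}$ reproduces the fitted values, not $\mathbf{y}$ itself (a sketch using the same mtcars model as above):

```r
model <- lm(mpg ~ wt + hp, data = mtcars)
Q <- qr.Q(qr(model.matrix(model)))
y <- as.matrix(mtcars['mpg'])
# Q %*% t(Q) projects y onto the column space of X
proj <- Q %*% t(Q) %*% y
isTRUE(all.equal(as.vector(proj), mtcars$mpg))             # FALSE: not y itself
isTRUE(all.equal(as.vector(proj), unname(fitted(model))))  # TRUE: the fitted values
```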

Taking this into account, I have the following questions:

  • Is the definition of an orthogonal matrix really $\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}$, or is it only $\mathbf{Q}^T\mathbf{Q} = \mathbf{I}$?
  • If the definition of an orthogonal matrix is $\mathbf{Q}\mathbf{Q}^T = \mathbf{Q}^T\mathbf{Q} = \mathbf{I}$, why does the following reproducible example in R version 4.3.1 give the results below?
model <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(object = model)
QR <- qr(x = X)
Q <- qr.Q(qr = QR)
dim(Q)
#> [1] 32  3
# t(Q) %*% Q is an identity matrix 
identity_matrix <- t(Q) %*% Q
# Q %*% t(Q) is not the identity matrix
not_an_identity_matrix <- Q %*% t(Q)

Created on 2023-11-04 with reprex v2.0.2

where Q %*% t(Q) is not an identity matrix, but t(Q) %*% Q is an identity matrix.
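A rank computation makes the asymmetry concrete: both products have rank $p + 1 = 3$, so the $32 \times 32$ product cannot be an identity matrix (a sketch; qr(...)$rank is used here as a simple numerical rank check):

```r
model <- lm(mpg ~ wt + hp, data = mtcars)
Q <- qr.Q(qr(model.matrix(model)))
dim(Q)                # 32 x 3
qr(t(Q) %*% Q)$rank   # 3: full rank for a 3 x 3 matrix, so it can be the identity
qr(Q %*% t(Q))$rank   # 3: a 32 x 32 matrix of rank 3 cannot be the identity
```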

  • I think $Q^TQ=QQ^T$ only applies to the full QR decomposition, i.e. qr.Q(..., complete = TRUE). I doubt that this is actually used in fitting though. – PBulls Nov 04 '23 at 18:01
  • Hello @PBulls. Can you post your answer not as a comment so I can accept what you mention? – luifrancgom Nov 04 '23 at 18:22
  • Relevant post: https://stats.stackexchange.com/questions/616331/how-to-reconcile-these-two-matrix-equations-for-obtaining-the-coefficients-for-a/616336#616336. Note that $Q$ is column-orthogonal, not fully orthogonal. – Zhanxiong Nov 04 '23 at 19:15
  • Your algebra is erroneous in several places: track the dimensions of the matrices to find out where your expressions make no sense. (The basic problem is that the limited Wikipedia definition of "orthogonal" is not appropriate here because it is intended for square matrices.) – whuber Nov 04 '23 at 20:18
  • Thank you @whuber. I pointed out in the statement the dimensions of each matrix. As you can see, there are no errors in what I mention; all the operations between the matrices are well defined. Also, thank you for the observation about square matrices. – luifrancgom Nov 04 '23 at 20:58
  • Also, @whuber, I added a reproducible example for the first part, where all the operations make sense. – luifrancgom Nov 04 '23 at 21:05
  • Unless $n=p+1$ it is impossible that both $QQ^\prime$ and $Q^\prime Q$ will be identity matrices, because one of them has rank $n$ and the other has rank $p+1.$ – whuber Nov 05 '23 at 13:02

1 Answer


Taking into account the comments by @PBulls, @Zhanxiong and @whuber, I was able to understand my problem, and I found the following reference: The QR decomposition of a matrix, which corresponds to the Hyper-Textbook: Optimization Models and Applications by Laurent El Ghaoui.

So there are two types of QR decomposition:

  • The case when $X$ has full column rank
  • Full QR decomposition

In the first case you only have $\mathbf{Q}^T\mathbf{Q} = \mathbf{I}$, so you only get the following result: $\mathbf{\hat{y}} = \mathbf{Q}\mathbf{Q}^T\mathbf{y}$

Using the following reproducible example you can see that the above result applies:

model <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(object = model)
QR <- qr(x = X)
Q <- qr.Q(qr = QR)
y <- as.matrix(mtcars['mpg'])
# Q %*% t(Q) projects y onto the column space of X, giving the fitted values
y_hat <- Q %*% t(Q) %*% y
fitted_values <- as.matrix(model$fitted.values)
all.equal(as.vector(y_hat), as.vector(fitted_values))
#> [1] TRUE

Created on 2023-11-04 with reprex v2.0.2

where y_hat and fitted_values are the same.

The full QR decomposition is not used in this particular context, but you can read about it for other purposes in The QR decomposition of a matrix > Full QR decomposition.
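For completeness, the full decomposition can be obtained with complete = TRUE, which returns a square $n \times n$ matrix for which both products are the identity (a sketch shown only to illustrate the definition, not the regression fit):

```r
model <- lm(mpg ~ wt + hp, data = mtcars)
QR <- qr(model.matrix(model))
Q_full <- qr.Q(QR, complete = TRUE)  # square 32 x 32 orthogonal matrix
dim(Q_full)
all.equal(t(Q_full) %*% Q_full, diag(32), check.attributes = FALSE)  # TRUE
all.equal(Q_full %*% t(Q_full), diag(32), check.attributes = FALSE)  # TRUE
```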

  • I don't see the value of the "full QR decomposition" in the regression setting. The insistence on finding a square orthogonal $Q$ might actually blur the essential geometric picture of regression. Adding the complete = TRUE option would forcefully expand the original $n \times p$ design matrix $X$ to $n \times n$, and hence will probably give you meaningless or misleading results (the only thing you would see is that it gives you a $Q$ such that $QQ^T = I_{(n)}$). – Zhanxiong Nov 04 '23 at 23:17
  • Thank you for the observation. I will delete the part about the "full QR decomposition". – luifrancgom Nov 06 '23 at 02:02