
We have the following sample containing two predictors ($x_1, x_2$) and one dependent variable ($y$).

$x_1=[-1.01, 3.23, 5.49, 0.23, -2.87, 3.67]$

$x_2=[-0.99, 3.25, 5.55, 0.21, -2.91, 3.76]$

$y=[-1.89, 10.33, 19.09, 2.19, -8.09, 11.29]$

I performed a PLS regression on this data and obtained the x-scores (a matrix $T$), the x-loadings (a matrix $P$), the y-scores (a matrix $U$) and the y-loadings (a matrix $Q$). From what I've read, the predicted values should be $\hat{y}=T\cdot Q^{T}$ (where $A^{T}$ denotes the transpose of a matrix $A$). In other words, we can estimate $y$ using the extracted factors (the columns of $T$), with the entries of $Q^{T}$ acting as regression coefficients. However, my output also shows the predicted/estimated values (i.e. $\hat{y}$), and they differ from the ones obtained by multiplying the matrices $T$ and $Q^{T}$.

So what is the regression equation that predicts the values of $y$? And can we estimate $y$ from the extracted factors (i.e. using the columns of $T$)?
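
For reference, here is a minimal sketch of the same check in Python with scikit-learn's NIPALS-based `PLSRegression` (a different implementation than the SVD-based method I used in SAS, so the conventions may not carry over exactly). In that implementation, $T\cdot Q^{T}$ reproduces the fitted values only on the centered/scaled scale, and the product matches the reported predictions once the centering/scaling of $y$ is undone:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# the sample from the question
X = np.column_stack([
    [-1.01, 3.23, 5.49, 0.23, -2.87, 3.67],   # x1
    [-0.99, 3.25, 5.55, 0.21, -2.91, 3.76],   # x2
])
y = np.array([-1.89, 10.33, 19.09, 2.19, -8.09, 11.29])

# x1 and x2 are nearly collinear, so a single factor captures almost everything
pls = PLSRegression(n_components=1, scale=True).fit(X, y)

T = pls.x_scores_     # x-scores
Q = pls.y_loadings_   # y-loadings

# T @ Q.T gives the fitted values of the *centered and scaled* y;
# undo that transformation (scikit-learn scales with ddof=1)
y_hat_scores = (T @ Q.T).ravel() * y.std(ddof=1) + y.mean()
y_hat_model = pls.predict(X).ravel()

print(np.allclose(y_hat_scores, y_hat_model))  # True for this implementation
```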

muffin
  • Should it not be $U$ rather than $T$? – ReneBt Jan 25 '20 at 23:11
  • Do you mean $\hat{y}=U\cdot Q^{T}$? I've also seen this formula somewhere, but there is still a difference between the estimates of $y$ from the output and the ones obtained via the matrix multiplication. Hence, I thought it may be $T\cdot Q^{T}$, since the matrix $T$ is somehow related to the predictors. – muffin Jan 25 '20 at 23:29
  • I think it depends on what algorithm is used. Are you using NIPALS or SIMPLS? – gunakkoc Jan 25 '20 at 23:32
  • I use SAS. The method is PLS, while the algorithm is SVD. – muffin Jan 25 '20 at 23:44

0 Answers