I have a PCA model in production. I take the 5 eigenvectors that correspond to the 5 largest eigenvalues. Imagine the input data is 200-dimensional: there will be a 5x200 matrix $V$ that projects the raw data into the eigenspace; after that, we project the low-dimensional 'feature' vector back to the 200-dimensional raw space with $V^{T}$.

Put together, the projection matrix is $V^{T}V$, which is a 200x200 square matrix.

Now here comes the question: if I need to generate such a matrix every day, how can I do a sanity check on the projection matrix? Since the input samples are quite similar, this projection should not vary much from day to day. I therefore want to make sure that the matrix does not suddenly change a lot; this is what I mean by 'sanity check'.

I tried plotting a heatmap of the matrix, but it does not tell me much: the 'top view' always looks the same.

What I am using now is the max and min of all the entries in the matrix, but I feel this is not enough.

Is there anything else I should be looking at? The determinant? The trace?

Any suggestions?

user152503
  • Could you explain more specifically what a "sanity check" is intended to reveal? – whuber Jan 19 '18 at 21:21
  • If you have a number of matrices that you know are correct, you could compute the Frobenius norms of all their differences and average them. You could then compare that to the average of the Frobenius norms of the new matrix to your vetted set. Just a thought. – meh Jan 19 '18 at 22:38
  • First, why do you have 5 x 200 instead of 200 x 5? – Onyambu Nov 19 '20 at 02:35
  • Maybe https://stats.stackexchange.com/questions/34396 solves your problem? – whuber Apr 20 '22 at 22:53
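
The Frobenius-norm idea from meh's comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`projection_from_basis`, `frobenius_drift`, `baseline_spread`) are hypothetical, the "vetted" matrices are synthetic stand-ins generated by perturbing one random basis, and no alert threshold is implied by the thread.

```python
import numpy as np

def projection_from_basis(V):
    # V is k x d with orthonormal rows (the question's 5 x 200 layout).
    return V.T @ V

def frobenius_drift(P_new, vetted):
    # Mean Frobenius distance from the new matrix to each vetted matrix.
    return float(np.mean([np.linalg.norm(P_new - P, ord="fro") for P in vetted]))

def baseline_spread(vetted):
    # Mean pairwise Frobenius distance among the vetted matrices themselves.
    dists = [np.linalg.norm(vetted[i] - vetted[j], ord="fro")
             for i in range(len(vetted)) for j in range(i + 1, len(vetted))]
    return float(np.mean(dists))

# --- synthetic stand-ins for real daily matrices (assumption) ---
rng = np.random.default_rng(0)
d, k = 200, 5

def random_basis(d, k):
    # Orthonormal basis of a random k-dimensional subspace, returned as k x d.
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return Q.T

def perturbed_basis(V, scale):
    # Add small noise to V and re-orthonormalise, mimicking day-to-day variation.
    Q, _ = np.linalg.qr((V + scale * rng.standard_normal(V.shape)).T)
    return Q.T

V_ref = random_basis(d, k)
vetted = [projection_from_basis(perturbed_basis(V_ref, 0.01)) for _ in range(5)]
P_similar = projection_from_basis(perturbed_basis(V_ref, 0.01))
P_unrelated = projection_from_basis(random_basis(d, k))

spread = baseline_spread(vetted)
drift_similar = frobenius_drift(P_similar, vetted)
drift_unrelated = frobenius_drift(P_unrelated, vetted)
print(spread, drift_similar, drift_unrelated)
```

One property that may help when choosing a threshold: for two rank-$k$ orthogonal projections, $\|P_1 - P_2\|_F^2 = 2k - 2\,\mathrm{tr}(P_1 P_2)$, so the distance is bounded above by $\sqrt{2k}$ (about 3.16 for $k = 5$) and can be read as a fraction of that maximum.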

2 Answers

Since the primary purpose of a projection matrix is to project arbitrary vectors onto a given vector subspace, a reasonable "sanity check" would be to take random starting vectors, apply the projection matrix, and then check that the resulting vectors lie in the intended subspace. If a result lies substantially outside that subspace (beyond what rounding error would explain), that suggests the projection matrix is wrong.
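
A minimal sketch of that check, assuming the projection matrix `P` is stored separately from the basis `V` (so the two can disagree). The function name `max_relative_residual` and the synthetic `V`, `P_good`, and `P_corrupt` are hypothetical. Membership in the subspace is tested by least-squares fitting each projected vector with the rows of `V` and measuring what is left over.

```python
import numpy as np

def max_relative_residual(P, V, n_trials=100, seed=0):
    # Project random vectors with P, then measure how much of each result
    # cannot be expressed as a combination of the rows of V.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((P.shape[1], n_trials))   # random starting vectors
    Y = P @ X                                         # projected vectors
    coeffs, *_ = np.linalg.lstsq(V.T, Y, rcond=None)  # best fit within span of V's rows
    residual = Y - V.T @ coeffs                       # part outside the subspace
    return float(np.max(np.linalg.norm(residual, axis=0) / np.linalg.norm(Y, axis=0)))

# --- synthetic example (assumption): 5 orthonormal directions in 200 dimensions ---
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((200, 5)))
V = Q.T                                               # 5 x 200, as in the question
P_good = V.T @ V
P_corrupt = P_good + 0.01 * rng.standard_normal((200, 200))

res_good = max_relative_residual(P_good, V)
res_corrupt = max_relative_residual(P_corrupt, V)
print(res_good, res_corrupt)
```

For a correct matrix the residual should sit at rounding-error level; the corrupted matrix leaves a visibly larger component outside the subspace. Note that this verifies consistency between `P` and `V` on a given day; it does not by itself compare one day's subspace with another's.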

Ben

Do you mean that you calculate new eigenvectors every day? Or do you want to ensure that the same projection remains valid? The latter sounds safer to me.

The stability of an eigenvector calculation is revealed by the eigenvalues: if two consecutive eigenvectors have similar eigenvalues, then the calculation is numerically unstable.

chrishmorris
  • Consider the eigenvalues of an identity matrix... – jbowman Jan 20 '18 at 00:02
  • Yes, that's a good example. All vectors are eigenvectors of the identity matrix. No dimension reduction is possible. – chrishmorris Jan 22 '18 at 07:29
  • The point is that two consecutive eigenvectors have exactly the same eigenvalues, yet the calculation is certainly numerically stable. – jbowman Jan 22 '18 at 16:13
  • And if it's a projection matrix, the eigenvalues are all either 0 or 1 irrespective of stability of anything. – Thomas Lumley Mar 20 '22 at 02:05
  • Yes, as jbowman says, the calculation of the eigenvalues is stable, but the calculation of the eigenvectors is not, and this is the OP's concern. – chrishmorris Mar 21 '22 at 08:25
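
Thomas Lumley's remark also bears on the "determinant? trace?" part of the question: for any valid rank-5 orthogonal projection in 200 dimensions, the eigenvalues are five 1s and 195 0s, so the trace is 5 and the determinant is 0 regardless of which subspace is chosen. These quantities can confirm that the matrix is a well-formed projection, but they cannot detect drift. A rough sketch of such a structural check, with a hypothetical function name `structural_checks` and a synthetic matrix:

```python
import numpy as np

def structural_checks(P, tol=1e-8):
    # Properties every orthogonal projection matrix must satisfy.
    sym_err = np.linalg.norm(P - P.T)             # P should equal its transpose
    idem_err = np.linalg.norm(P @ P - P)          # P applied twice should equal P
    eigvals = np.linalg.eigvalsh((P + P.T) / 2)   # symmetrise before eigvalsh
    return {
        "symmetric": bool(sym_err < tol),
        "idempotent": bool(idem_err < tol),
        "trace": float(np.trace(P)),              # equals the rank for a projection
        "n_ones": int(np.sum(np.abs(eigvals - 1) < tol)),
        "n_zeros": int(np.sum(np.abs(eigvals) < tol)),
    }

# --- synthetic example (assumption): random 5-dimensional subspace of R^200 ---
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((200, 5)))
P = Q @ Q.T

checks = structural_checks(P)
print(checks)
```

Because these values are the same for every rank-5 projection, a day-over-day comparison still needs a distance between matrices or subspaces, such as the Frobenius-norm approach suggested in the comments on the question.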