
Suppose I have a correlation matrix, $A$, and I already have the eigenvalues and eigenvectors of this matrix.

For a given vector, $\mathbf{\mathit{v}}$, I want to calculate the eigenvalues and eigenvectors of the following new matrices:

$$A+\mathbf{\mathit{v}}\mathbf{\mathit{v}}^T\textrm{ and }A-\mathbf{\mathit{v}}\mathbf{\mathit{v}}^T$$

Some relevant questions on the website:

Max Wong
  • What is your question? – Richard Sep 03 '20 at 17:17
  • @Richard I want to calculate the eigendecomposition of $A+vv^T$ and $A-vv^T$ efficiently, at least with complexity lower than $\mathcal{O}(n^3)$, by using knowledge of the eigendecomposition of $A$ – Max Wong Sep 03 '20 at 18:25
  • You could try using inverse iteration starting with the eigenvalues of $A$, but the work to solve the different linear systems for each eigenvalue would probably be so large that it would be faster to just find the eigenvalues of $A+vv^{T}$ from scratch without using the previous eigenvalues. – Brian Borchers Sep 03 '20 at 22:54
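For what it's worth, here is a rough sketch of the inverse-iteration idea from the last comment (my own illustrative code, not part of the thread and not an endorsement of its efficiency): each old eigenpair of $A$ supplies the shift and the starting vector for a few steps of shifted inverse iteration on the updated matrix. As the comment warns, each refined eigenpair needs its own shifted solves, so doing this for all $n$ eigenpairs is generally not cheaper than a fresh eigendecomposition.

```python
# Illustrative sketch of shifted inverse iteration started from an old eigenpair.
# Names are my own; each dense solve below costs O(n^3), which is the bottleneck
# pointed out in the comment above.
import numpy as np

def refine_eigenpair(M, shift, x0, iters=2):
    """A few inverse-iteration steps on M, started from an eigenpair
    (shift, x0) of the un-updated matrix. Assumes M - shift*I is not
    exactly singular (generically true after the rank-one update)."""
    n = M.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.linalg.solve(M - shift * np.eye(n), x)   # O(n^3) dense solve
        x = y / np.linalg.norm(y)
    lam = x @ M @ x                                     # Rayleigh quotient
    return lam, x

rng = np.random.default_rng(1)
n = 300
B = rng.standard_normal((n, n))
A = (B @ B.T) / n                       # symmetric stand-in for the correlation matrix
v = 1e-3 * rng.standard_normal(n)       # small update, so old eigenpairs are good guesses

w, X = np.linalg.eigh(A)                # old eigenpairs of A
lam, x = refine_eigenpair(A + np.outer(v, v), w[-1], X[:, -1])
print(lam, np.linalg.eigvalsh(A + np.outer(v, v))[-1])   # should agree closely
```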

1 Answer


Unfortunately, I don't think there is a good algorithm to do this efficiently.

Given the eigendecomposition $\mathbf A = \mathbf X \mathbf D \mathbf X^T$, one is tempted to project $\mathbf v$ onto the eigenvectors by introducing the vector $\mathbf u = \mathbf X^T \mathbf v$, to form $\mathbf A + \mathbf v \mathbf v^T = \mathbf X \left(\mathbf D + \mathbf u \mathbf u^T \right) \mathbf X^T$, and then to attack the inner system (a diagonal matrix with a rank-one update) with something clever. Golub outlines an algorithm for computing such an eigendecomposition here, which requires only $\mathcal O(n^2)$ flops. The catch is that, even once you possess the decomposition $\mathbf D + \mathbf u \mathbf u^T = \mathbf Y\mathbf D_2\mathbf Y^T$, you are still faced with the issue of composing the "final" decomposition, $\mathbf A + \mathbf v \mathbf v^T = \left(\mathbf X \mathbf Y \right) \mathbf D_2 \left(\mathbf X \mathbf Y \right)^T$, and explicitly tabulating the "final" eigenvectors $\mathbf X \mathbf Y$ requires $\mathcal O(n^3)$ flops. This spoils the complexity advantage of doing something clever: it's asymptotically no better than just accumulating $\mathbf A + \mathbf v \mathbf v^T$ and using the usual algorithm.
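To illustrate the $\mathcal O(n^2)$ eigenvalue-only part of that idea, here is a rough sketch in NumPy/SciPy (my own naming, and a simple root-bracketing stand-in for the careful solver in Golub's paper): it forms $\mathbf u = \mathbf X^T\mathbf v$ and finds the eigenvalues of $\mathbf D + \mathbf u\mathbf u^T$ as the roots of the secular equation $1 + \sum_i u_i^2/(d_i - \lambda) = 0$, assuming the generic case of distinct $d_i$ and nonzero $u_i$. Recovering the eigenvectors (the product $\mathbf X\mathbf Y$) is the step that brings back the $\mathcal O(n^3)$ cost.

```python
# Hedged sketch: eigenvalues of A + v v^T from an existing A = X D X^T,
# via the secular equation of the projected rank-one update D + u u^T.
# The brentq-based bracketing is a simplification of a production solver
# and assumes distinct d_i and nonzero u_i.
import numpy as np
from scipy.optimize import brentq

def secular_eigenvalues(d, u):
    """Eigenvalues of diag(d) + u u^T for d sorted ascending with distinct entries."""
    n = d.size
    unorm2 = float(u @ u)
    eps = 1e-10 * max(1.0, unorm2)

    def f(lam):
        return 1.0 + np.sum(u * u / (d - lam))

    lams = np.empty(n)
    for i in range(n):
        lo = d[i] + eps                                   # root i lies just above d[i] ...
        hi = d[i + 1] - eps if i + 1 < n else d[-1] + unorm2 + eps
        lams[i] = brentq(f, lo, hi)                       # ... and below d[i+1] (or d_max + ||u||^2)
    return lams

rng = np.random.default_rng(0)
n = 200
B = rng.standard_normal((n, n))
A = (B @ B.T) / n                        # symmetric stand-in for the correlation matrix
v = rng.standard_normal(n)

d, X = np.linalg.eigh(A)                 # existing decomposition, d ascending
u = X.T @ v                              # O(n^2) projection of the update
lam_update = secular_eigenvalues(d, u)   # O(n^2) eigenvalues of A + v v^T
lam_direct = np.linalg.eigvalsh(A + np.outer(v, v))
print(np.max(np.abs(lam_update - lam_direct)))   # small, near round-off level
```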

It's worth pointing out that if you're lucky and $\mathbf v$ is known to be an eigenvector of $\mathbf A$, the eigendecomposition of $\mathbf A + \mathbf v \mathbf v^T$ is easy to puzzle out: all the eigenvectors are the same, and only the eigenvalue associated with $\mathbf v$ changes. With some effort this idea can be extended to the case where $\mathbf v$ is a linear combination of just a handful ($\mathcal O(1)$) of the original eigenvectors. Unfortunately, an arbitrary $\mathbf v$ is probably a combination of all the eigenvectors of $\mathbf A$, which spoils this line of attack for the general case, too.
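Spelling out that lucky case as a one-line check (with $\mathbf x$ denoting the unit eigenvector along $\mathbf v$): if $\mathbf A\mathbf x = \lambda\mathbf x$, $\|\mathbf x\| = 1$, and $\mathbf v = c\,\mathbf x$, then

$$\left(\mathbf A + \mathbf v\mathbf v^T\right)\mathbf x = \mathbf A\mathbf x + c^2\,\mathbf x\left(\mathbf x^T\mathbf x\right) = \left(\lambda + c^2\right)\mathbf x,$$

while every other eigenvector $\mathbf x_j \perp \mathbf x$ satisfies $\left(\mathbf A + \mathbf v\mathbf v^T\right)\mathbf x_j = \lambda_j\mathbf x_j$ unchanged, since $\mathbf v^T\mathbf x_j = 0$.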

So, I am pessimistic on this question (but would be happy to be proven wrong).

rchilton1980
  • I think the approach you point out is very clever. I agree that the algorithm might be very difficult to optimize. However, I think your proposal can reduce computation time as long as the computation of $XY$ is faster than computing the eigendecomposition of $A$ directly (even if they're all of order $\mathcal{O}(n^3)$). – Max Wong Sep 03 '20 at 18:31
  • Potentially related: https://math.stackexchange.com/q/3052997 – Abdullah Ali Sivas Sep 03 '20 at 18:35
  • Incidentally, your answer shows that one can compute the eigenvalues alone (not the eigenvectors) in $O(n^2)$, which is already non-obvious. – Federico Poloni Sep 03 '20 at 18:46
  • It's a joy every time to read a good answer: this one doesn't just answer the what, but also the why! – Wolfgang Bangerth Sep 03 '20 at 19:10
  • I might have a follow-up, though: The question originally likely starts from wanting to create an "online" algorithm, for example if one is interested in addressing how the covariance matrix (or its eigenvalues) changes with every incoming new sample being measured. In these cases, the changes to the eigenvalues are likely quite small if one already has a lot of samples. This opens up the possibility of thinking of inexact, iterative algorithms to update the eigenvalues: with expected small changes, one might think that one would only need to do 1 or 2 iterations with every new sample. Ideas? – Wolfgang Bangerth Sep 03 '20 at 19:14
  • @WolfgangBangerth this is indeed my intention of asking the question :) – Max Wong Sep 03 '20 at 20:06
  • Can that be quantified somehow? Perhaps we expect $|\mathbf A| >> |\mathbf v \mathbf v^T|$, or similar? – rchilton1980 Sep 03 '20 at 20:08
  • @rchilton1980 what do you mean by $|A|$? – Max Wong Sep 03 '20 at 20:10
  • Matrix norm, any flavor really. – rchilton1980 Sep 03 '20 at 20:19
  • I don't think that's necessarily true, at least not for any of the typical norms that take the max over something. I'm thinking of something like $A_{n} = A_{n-1}+v_nv_n^T$ and then asking that $|\lambda_{n,i}-\lambda_{n-1,i}| \ll \max_j |\lambda_{n-1,j}|$ for all $i$, where $\lambda_{n,i}$ is the $i$th eigenvalue of $A_n$. In other words, the updates to the eigenvalues are small compared to the size of the maximum eigenvalue. (They may be of appreciable size for the small eigenvalues, but we generally don't care about them too much.) – Wolfgang Bangerth Sep 03 '20 at 21:33
  • @rchilton1980, yes, that's true. We can confidently say that $|A|\gg|vv^T|$. $A$ is typically of size $2000\times500$. And each entry of $A$ and $v$ can be thought of as a random sample from $\mathcal{N}(0,1)$. Even if there are some outliers, I can winsorise them. – Max Wong Sep 04 '20 at 02:18
  • @WolfgangBangerth, is Gershgorin's theorem useful there? The diagonal part of the update will shift the centres of the Gershgorin's discs, while the off-diagonal parts will change the radius. Hence, just by looking at $uu^T$ term we have the bound $|\lambda_{n,i}-\lambda_{n-1,i}|<\max_k (\sum_{j} |u_ku_j|)$. So if the largest eigenvalue is much greater than the max abs row sum of the update, we wouldn't expect a big change. – Abdullah Ali Sivas Sep 04 '20 at 03:54
  • @WolfgangBangerth, I was doing some reading and it looks like I reinvented the wheel. Weyl's theorem is more general and more useful, Wikipedia Link: https://en.wikipedia.org/wiki/Weyl%27s_inequality#Weyl's_inequality_about_perturbation . It implies $\lambda_{\max}(A+uu^T)-\lambda_{\max}(A)\leq ||u||_2^2$, in addition to its other uses. Ilse Ipsen, together with Nadler, has a pretty good paper on it, "Refined Perturbation Bounds for Eigenvalues of Hermitian and Non-Hermitian Matrices". – Abdullah Ali Sivas Sep 04 '20 at 06:18
  • @AbdullahAliSivas: Right, but that only gives me a bound on the change in eigenvalues. In the spirit of the original question, I was hoping for an algorithm that produces an "approximate update" to the eigenvalues. – Wolfgang Bangerth Sep 04 '20 at 21:43
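As a small numerical aside to the Weyl bound quoted a few comments above, here is a rough check (my own script, not part of the thread, using $v$ for the update vector as in the question): on random instances, the largest eigenvalue moves by at most $\|v\|_2^2$ under the rank-one update.

```python
# Quick check of the Weyl-type bound from the comments:
#   lambda_max(A + v v^T) - lambda_max(A) <= ||v||_2^2
# Illustrative script; names and the random test matrices are my own.
import numpy as np

rng = np.random.default_rng(42)
n = 100
for trial in range(5):
    B = rng.standard_normal((n, n))
    A = (B @ B.T) / n                      # symmetric stand-in for the correlation matrix
    v = rng.standard_normal(n) * rng.uniform(0.01, 1.0)

    shift = np.linalg.eigvalsh(A + np.outer(v, v))[-1] - np.linalg.eigvalsh(A)[-1]
    bound = v @ v                          # ||v||_2^2
    print(f"shift = {shift:.6f}  <=  ||v||^2 = {bound:.6f}: {shift <= bound}")
```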