
Suppose the following linear system is given $$Lx=c,\tag1$$ where $L$ is a weighted Laplacian known to be positive semi-definite with a one-dimensional null space spanned by $1_n=(1,\dots,1)\in\mathbb{R}^n$. The solution $x\in\mathbb{R}^{n}$ is translation invariant, i.e., replacing $x$ by $x+a1_n$ does not change the value of the objective function whose gradient gives $(1)$. The only positive entries of $L$ are on its diagonal, and each diagonal entry equals the sum of the absolute values of the negative off-diagonal entries in its row.

I found in a highly cited academic work in its field that, although $L$ is *not strictly* diagonally dominant, methods such as Conjugate Gradient, Gauss-Seidel, and Jacobi can still be safely used to solve $(1)$. The rationale is that, because of the translation invariance, one is free to fix one point (e.g., remove the first row and column of $L$ and the first entry of $c$), thereby converting $L$ to a *strictly* diagonally dominant matrix. In any case, the original system is solved in the full form of $(1)$, with $L\in\mathbb{R}^{n\times n}$.

Is this assumption correct and, if so, what is the alternative rationale? I'm trying to understand why the convergence of these methods still holds.

If the Jacobi method converges for $(1)$, what can one state about the spectral radius $\rho$ of the iteration matrix $D^{-1}(D-L)$, where $D$ is the diagonal matrix carrying the diagonal of $L$? Is $\rho(D^{-1}(D-L))\leq1$, which would differ from the general convergence guarantee $\rho(D^{-1}(D-L))<1$? I'm asking this since the eigenvalues of the normalized Laplacian $D^{-1}L$, which has ones on the diagonal, should lie in the range $[0, 2]$.
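For concreteness, here is a minimal sketch (the graph and weights are arbitrary, not from the cited work) that builds a small weighted Laplacian of this form and inspects the eigenvalues of the Jacobi iteration matrix $I - D^{-1}L$:

```python
import numpy as np

# Minimal sketch: a small weighted Laplacian with arbitrary positive edge
# weights, and the eigenvalues of the Jacobi iteration matrix
# D^{-1}(D - L) = I - D^{-1} L.

rng = np.random.default_rng(0)
n = 6
W = rng.uniform(0.1, 1.0, size=(n, n))
W = np.triu(W, 1)
W = W + W.T                              # symmetric positive edge weights
L = np.diag(W.sum(axis=1)) - W           # weighted Laplacian, L @ np.ones(n) == 0

D = np.diag(np.diag(L))
J = np.eye(n) - np.linalg.solve(D, L)    # Jacobi iteration matrix I - D^{-1} L

# J is similar to a symmetric matrix, so its eigenvalues are real; the
# eigenvalue 1 corresponds to the constant vector 1_n spanning Ker(L).
print(np.sort(np.linalg.eigvals(J).real))
```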

From the original work:

......................................

At each iteration, we compute a new layout $(x(t+1), y(t+1))$ by solving the following linear system: $$ L \cdot x(t+1) = L(x(t),y(t)) \cdot x(t) \\ L \cdot y(t+1) = L(x(t),y(t)) \cdot y(t) \tag 8$$ Without loss of generality we can fix the location of one of the sensors (utilizing the translation degree of freedom of the localized stress) and obtain a strictly diagonally dominant matrix. Therefore, we can safely use Jacobi iteration for solving (8).

.......................................

In the above, the notion of "iteration" refers to the underlying minimization procedure and is not to be confused with the Jacobi iteration. So the system is solved by Jacobi (iteratively), and the solution is then brought to the right-hand side of (8) for another iteration of the underlying minimization. I hope this clarifies the matter.

Note that I found the question "Which iterative linear solvers converge for positive semidefinite matrices?", but I am looking for a more elaborate answer.

usero
  • Could you post a link or a citation to the highly cited work? – Geoff Oxberry Mar 04 '12 at 05:12
  • It could be retrieved from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.164.1421 Since you are not expected to read the whole work, take a look at p. 7 (bottom). I suppose the choice of iterative solvers is justified, but I feel a better (or, at least, different) rationale is needed. – usero Mar 04 '12 at 14:22
  • I wonder whether these guys are from the same community as the combinatorial preconditioners. – shuhalo Mar 05 '12 at 14:26

2 Answers


Krylov methods never explicitly use the dimensionality of the space they iterate in; therefore, you can run them on singular systems as long as you keep the iterates in the non-null subspace. This is normally done by projecting out the null space at each iteration. Two things can go wrong, the first being much more common than the second.

  1. The preconditioner is unstable when applied to the singular operator. Direct solvers and incomplete factorization may have this property. As a practical matter, we just choose different preconditioners, but there are more principled ways to design preconditioners for singular systems, e.g. Zhang (2010).
  2. At some iteration, $x$ is in the non-null subspace, but $A x$ lives entirely in the null space. This is only possible with nonsymmetric matrices. Unmodified GMRES breaks down in this scenario, but see Reichel and Ye (2005) for breakdown free variants.

To solve singular systems using PETSc, see KSPSetNullSpace(). Most methods and preconditioners can solve singular systems. In practice, the small null space arising for PDEs with Neumann boundary conditions is almost never a problem as long as you inform the Krylov solver of the null space and choose a reasonable preconditioner.
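For illustration, here is a minimal sketch of that workflow with SciPy's `cg` standing in for a Krylov solver (the Laplacian construction is arbitrary): make the right-hand side consistent, solve, and project the constant mode out of the result.

```python
import numpy as np
from scipy.sparse.linalg import cg

# Sketch: CG on the singular system L x = c with a consistent right-hand side.
# The Laplacian below is an arbitrary stand-in, not taken from the paper.

def build_laplacian(n, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.1, 1.0, size=(n, n))
    W = np.triu(W, 1)
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W    # L @ np.ones(n) == 0

n = 50
L = build_laplacian(n)
ones = np.ones(n)

c = np.random.default_rng(1).standard_normal(n)
c -= ones * (ones @ c) / n               # enforce consistency: c^T 1_n = 0

x, info = cg(L, c)                       # info == 0 indicates convergence
x -= ones * (ones @ x) / n               # remove any null-space component
print(info, np.linalg.norm(L @ x - c))
```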

From the comments, it sounds like you are specifically interested in Jacobi. (Why? Jacobi is useful as a multigrid smoother, but there are much better methods to use as solvers.) Jacobi applied to $A x = b$ does not converge when the vector $b$ has a component in the null space of $A$; however, the part of the solution orthogonal to the null space does converge, so if you project the null space out of each iterate, it converges. Alternatively, if you choose a consistent $b$ and initial guess, the iterates (in exact arithmetic) do not accumulate components in the null space.
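Here is a minimal sketch of that last point, assuming an arbitrary weighted Laplacian and a consistent right-hand side (the `jacobi` helper is ad hoc, not a library routine):

```python
import numpy as np

# Plain Jacobi on L x = c with a consistent c, optionally projecting the
# constant mode out of each iterate. Illustrative only.

def jacobi(L, c, iters=500, project=True):
    n = L.shape[0]
    d = np.diag(L)
    ones = np.ones(n)
    x = np.zeros(n)                              # consistent initial guess
    for _ in range(iters):
        x = (c - (L - np.diag(d)) @ x) / d       # x_{k+1} = D^{-1}(c - (L - D) x_k)
        if project:
            x -= ones * (ones @ x) / n           # project out Ker(L) = span{1_n}
    return x

rng = np.random.default_rng(2)
W = rng.uniform(0.1, 1.0, size=(8, 8))
W = np.triu(W, 1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W
c = rng.standard_normal(8)
c -= c.mean()                                    # consistent right-hand side, c^T 1_n = 0

x = jacobi(L, c)
print(np.linalg.norm(L @ x - c))                 # small residual after enough iterations
```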

Jed Brown
  • You could perform an orthogonal change of basis so that there is a zero on the diagonal (find any orthogonal matrix $Q$ in which the first column is the constant vector). Under this transformation $A_1 = Q^T A Q$, the matrix $A_1$ is still symmetric positive semi-definite, but the first diagonal entry is 0 so direct application of Jacobi would fail. Since $A_1$ is dense, you wouldn't do this in practice, but this shows that the basis matters. If $Z$ is an orthogonal basis for the null space, projected GMRES is just solving $(I-Z)P^{-1} A x = (I-Z)P^{-1} b$. – Jed Brown Mar 05 '12 at 15:45
  • Hmm, it seems I replied to a comment that was deleted. I'll leave the comment here in case it's useful. – Jed Brown Mar 05 '12 at 15:46
  • Thanks for the answer, it's at a much more specialized level than I expected. Therefore, I'll need some guidance on: 1) how to project out the null space at each iteration? 2) In my understanding, you stated that the Jacobi iteration applied to the system as originally stated might not converge to the exact solution (i.e., the iterates do not become better solution estimates). Is it therefore suggested to choose different preconditioners? If so, does that practically imply a dynamic check on the behaviour with $diag(A)$, and a change if a problem occurs (with the above case of the linear system)? – usero Mar 05 '12 at 15:50
  • My 1) from above should be read as: given the Jacobi iteration with the system as originally posted, is it necessary to project out the null space, and, if so, how could one incorporate that within the update $X_{k+1}=D^{-1}(b-(A-D)X_k)$? By postprocessing the iterate $X_{k+1}$ and using the postprocessed version as $X_{k}$? – usero Mar 05 '12 at 15:59
  • In a reasonable basis, Jacobi should be stable. It can also use 1 on the diagonal if the diagonal matrix element is 0; the projection still removes the null space. Are you planning to use a Krylov method like CG or GMRES? If not, why not? If you are, you just need an orthogonal basis for the null space. You only have the constant mode in your null space, so an orthogonal projector into the null space is $N=ZZ^T$ where $Z$ is the column vector. The orthogonal projector that removes the null space is thus $I-N$. (My first comment had a mistake: if $Z$ is the basis, $N=I-ZZ^T$ is the projector.) – Jed Brown Mar 05 '12 at 16:04
  • Thanks for answering. I'm mainly concerned about the guaranteed convergence of Jacobi with the above system, since the solver has been implemented, and the method is re-used in many other places. Actually, sometimes (rarely) there are problems, but the authors explained them as numerical imprecision (that's what raised my doubts). So, the conclusion is that the spectral radius of the iteration matrix $\rho[D^{-1}(D-L)]<1$ (since it often converges)? – usero Mar 06 '12 at 09:13
  • I added a note to my answer. – Jed Brown Mar 06 '12 at 16:01
  • Your answer is getting clearer. Note that the null space of the Laplacian $L$ is spanned by $(1,\dots, 1)\in\mathbb{R}^n$. As for the initial guess, note that the actual system form is $$Lx=L_wy,$$ where $L_w$ is another weighted Laplacian, and the initialization is $x_0=y$. I hope the reasons why the Jacobi approach to solving this system is well behaved are emerging (and I apologize for not providing the initialization and null-space details earlier). – usero Mar 06 '12 at 16:56
  • Could you be more specific with "Jacobi applied to $Ax=b$ does not converge when the vector $b$ has a component in the null space of $A$, however, the part of the solution orthogonal to the null space does converge, so if you project the null space out of each iterate, it converges"? 1) How could one show that formally? 2) Does that imply that with $b$ having zero component along the vectors spanning the null space of $A$, Jacobi would be convergent, and would this imply that each Jacobi iterate has a zero component in the null space of $A$? – usero Apr 26 '12 at 20:13

The Jacobi iteration can be proved convergent.

The first thing you should make sure of is that $c^T \mathbf{1}_n = 0$, which is the condition for the existence of a solution (I assume $L=L^T$; otherwise you need $c\in (\mathrm{Ker}\, L^T)^\perp$), because you said $V_0:=\mathrm{Ker}\, L = \mathrm{span}\{\mathbf{1}_n\}$. We will use the convention that $V_0$ also denotes the matrix whose columns form an orthonormal basis of this space. In your case, $V_0=\mathbf{1}_n/\sqrt{n}$.

Then, for the errors of the Jacobi iteration on the original system, you have $$ e_1 = (I-D^{-1}L)e_0 = (I-D^{-1}L) (P e_0 + V_0a)=(I-D^{-1}L) P e_0 + V_0a,$$ where $P:=I-V_0V_0'$ is the orthogonal projection onto $V_1:=V_0^\perp$. From the above iteration, we know that $$ P e_1 = P (I-D^{-1}L) P e_0, $$
from which we obtain the iteration matrix $S$ in $V_1$, $$ S := P (I-D^{-1}L) P. $$ Note that $S$ has the same spectrum (except possibly for zeros) as the matrix $$ \tilde{S}:= (I-D^{-1}L) P P=(I-D^{-1}L) P=(I-D^{-1}L)(I-V_0V_0')\\ =I-D^{-1}L-V_0V_0',$$ where the last equality uses $LV_0=0$. We want the spectral radius of $S$ to be less than one in order to prove convergence.

The following passage is outdated and kept only for reference; see below for the new proof.

In your case, $V_0V_0'=\frac{1}{n}\mathbf{1}_{n\times n}$. You can verify that $D^{-1}L+V_0V_0'$ is strictly diagonally dominant by using the assumption that the entries of $L$ are positive on the diagonal and negative elsewhere. To show that the eigenvalues of $D^{-1}L+V_0V_0'$ are real, we note that the matrix is self-adjoint under the inner product $\langle x,y\rangle := y^TDx$.

If $V_0$ does not have this specific form, I have not found an answer to the convergence question. Could someone clarify this?

Note that $V_0$ is the eigenvector corresponding to the eigenvalue $1$ of $I-D^{-1}L$. Based on this observation, we invoke Theorem 2.1 from "Eigenvalues of rank-one updated matrices with some applications" by Jiu Ding and Ai-Hui Zhou.

Theorem 2.1 Let $u$ and $v$ be two $n$-dimensional column vectors such that $u$ is an eigenvector of $A$ associated with eigenvalue $\lambda_1$. Then, the eigenvalues of $A+uv^T$ are $\{\lambda_1+u^Tv,\lambda_2,\ldots,\lambda_n\}$ counting algebraic multiplicity.

Then we know that the spectrum of $\tilde{S}$ is the same as that of $I-D^{-1}L$, except that the eigenvalue $1$ of the latter (with eigenvector $V_0$, so $u=V_0$, $v=-V_0$ and $u^Tv=-1$ in the theorem) is shifted to the eigenvalue zero of the former. Since the spectrum of $I-D^{-1}L$ is contained in $(-1,1]$, the spectrum of $\tilde{S}$ is contained in $(-1,1)$, i.e., $\rho(\tilde{S})<1$.
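A quick numerical check of this shift (with arbitrary positive weights, purely for illustration):

```python
import numpy as np

# Check: the spectrum of I - D^{-1}L - V0 V0' equals that of I - D^{-1}L
# with the eigenvalue 1 (eigenvector 1_n) moved to 0.

rng = np.random.default_rng(3)
n = 7
W = rng.uniform(0.1, 1.0, size=(n, n))
W = np.triu(W, 1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W            # weighted Laplacian, Ker(L) = span{1_n}
Dinv = np.diag(1.0 / np.diag(L))

J = np.eye(n) - Dinv @ L                  # I - D^{-1} L, satisfies J @ 1_n = 1_n
V0 = np.ones((n, 1)) / np.sqrt(n)         # orthonormal basis of Ker(L)
S_tilde = J - V0 @ V0.T                   # rank-one update A + u v^T with u = V0, v = -V0

print(np.sort(np.linalg.eigvals(J).real))        # contains the eigenvalue 1
print(np.sort(np.linalg.eigvals(S_tilde).real))  # same values, 1 shifted to 1 + u^T v = 0
```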

Hui Zhang
  • Thanks for answering. Something similar is what I had considered: namely, with the weighted Laplacian structured as $D^{-1}L$ above, it can be shown that its eigenvalues lie within $[0, 2)$, hence its spectral radius lies within $(0, 2)$ (one eigenvalue is greater than $0$, and at least one is $0$). Therefore, the spectral radius of the iteration matrix $I-D^{-1}L$ is less than $1$, hence Jacobi converges. Perhaps the above assumption on the spectral radius of $I-D^{-1}L$ (excluding $0$) is not safe? – usero Mar 10 '12 at 11:51
  • I think the spectrum of $D^{-1}L$ should be in $[0,2]$, that is, closed at $2$. I do not know how you can get $2$ excluded. From my point of view, the [Gershgorin circle theorem](http://en.wikipedia.org/wiki/Gershgorin_circle_theorem) can only give the estimate including $2$. If that is the case, the estimate of the spectral radius of $I-D^{-1}L$ is $\leq 1$, with the equality achieved by the vectors in the kernel of $L$. I think the convergence you want is that in the orthogonal complement space $V_1$, as noted in the above 'answer'. – Hui Zhang Mar 10 '12 at 12:07
  • You could take a look at Lemma 1.7 (v) of math.ucsd.edu/~fan/research/cb/ch1.pdf The matrix $D^{-1}L$ can be regarded as a weighted Laplacian on a complete graph, hence $2$ is excluded. I guess this is a sufficient argument for the convergence proof? Also, does your approach require other pre/post-processing of the iterates beyond centering $c$? I'm asking because you introduced $V_0$. And regarding the spectrum of $I-D^{-1}L-V_0V_0'$: given that the spectral radius ($sr$) of $I-D^{-1}L$ is in $(0, 1]$, the addition of $-\frac{1}{n}$ would yield $sr<1$. Isn't this a good enough argument? – usero Mar 12 '12 at 08:26
  • Hi, thanks for pointing to a good book, but I found I cannot take just a quick look at it. About your last argument: it appears almost the same as the "answer" above. Just be careful, you are not adding $\frac{1}{n}$ but $\frac{1}{n}\mathbf{1}_{n\times n}$, so it is not a simple addition to the $sr$ of $I-D^{-1}L$. In general, the $sr$ of a sum of two matrices is not simply the sum of the $sr$'s of the individual matrices. – Hui Zhang Mar 12 '12 at 10:05
  • Good that you pointed that out. Does your approach require other pre/post-processing of the iterates beyond centering $c$? I'm asking because you introduced $V_0$, and I thought that you were talking about projecting out the null space. If so, is projecting the null space out really necessary for convergence? – usero Mar 12 '12 at 10:27
  • You may use the projector in your algorithm or not. In both cases, what you can expect to converge is the projected component. If you do not use it, be aware that the iterates you get may not converge, and even if they converge I do not know what their component in the kernel of $L$ is. This needs more consideration. – Hui Zhang Mar 12 '12 at 19:17
  • I still have difficulties understanding why you introduced $V_0$ and expanded $e_1$ by introducing $P$. Note that the right-hand side of the original system is centered, i.e., it always holds that $c^T1_n=0$. In your first comment above ("I think the"), are you implying that with each iterate one should project out the $1_n$ component to ensure convergence? If so, I wonder whether this is necessary, since $c^T1_n=0$, and for the initial $x_0$ it holds that $x_0^T1_n=0$, hence the $1_n$ component is already projected out. Given such an initial $x_0$, is it necessary to project out $1_n$ for each iterate $x_k$? – usero Mar 22 '12 at 09:25
  • In fact, $V_1$ is invariant under $L$ because $V_0=\mathrm{Ker}(L)$, but $D^{-1}$ can send $Lx_0$ outside $V_1$. A counter-example: take $y=(1,-1)^T$ satisfying $y^T(1,1)^T=0$ and $D=diag(1,2)$. However, you only need to project once, after the last iteration; it gives the same result as projecting at every iteration. I assumed that you really want a solution in $V_1$, not a solution $u_0+u_1$ with some non-zero component $u_0\in V_0$. – Hui Zhang Mar 23 '12 at 20:02
  • Ok; so, your $Pe_1$ from above implies you're projecting out the $1_n$ component from each iterate? Basically, I could let plain Jacobi run on the above system, but with $c^T1_n=0$ and $x_0^T1_n=0$, and only at the end project out $1_n$ from the last iterate $x_l$? – usero Mar 24 '12 at 16:07
  • No, in the answer $Pe_1$ is introduced only for the proof; it does not imply anything about the numerical implementation. For the rest, you are right. – Hui Zhang Mar 24 '12 at 22:15
  • Understood. As for the spectral radius of $I-D^{-1}A-V_0V'_0$, since $\rho(I-D^{-1}A)\in(0,2]$ and $\rho(\frac{1}{n}V_0V'_0)=1$, could I safely deduce that $\rho(I-D^{-1}A-V_0V'_0)<1$? Again, thank you for your persistence. – usero Mar 26 '12 at 07:33
  • Hi, I have just given a new proof. See it in the answer. – Hui Zhang Mar 26 '12 at 08:25
  • I understand that you shift the eigenvalue $1$ of $S$ to the eigenvalue $0$ of $\tilde{S}$, but could it happen that there is more than one eigenvalue equal to $1$ in $S$? Then, the spectral radius of $\tilde{S}$ would be $\leq 1$. – usero Apr 02 '12 at 14:33
  • Even if that happens, the theorem does count algebraic multiplicity, so as long as the dimension of the eigenspace associated with $1$ equals the algebraic multiplicity, you can take $V_0$ to be an orthonormal basis of that eigenspace and the analysis still holds. – Hui Zhang Apr 02 '12 at 16:22
  • Does "counting algebraic multiplicity" imply that each occurrence of the eigenvalue $1$ is shifted to $0$? Could you be more specific in the above comment? Given more eigenvalues equal to $1$, how does one avoid $\rho(\tilde{S})\leq 1$? – usero Apr 04 '12 at 11:10
  • This is a good question. I need to read that paper to give you answer. – Hui Zhang Apr 04 '12 at 17:18