7

In deriving a Newton scheme, I end up with a Jacobian matrix of the form $J=I+ab^T$ where $a,b$ are vectors. For practical reasons, I want to approximate it by a symmetric positive definite matrix. Symmetrizing is easy: replace it by $J \approx I+\frac 12(ab^T+ba^T)$. But the (now real) eigenvalues may still be negative, so I further approximate $J \approx \tilde J := I+\frac 12\alpha (ab^T+ba^T)$ where I want to choose $\alpha$ so that $\tilde J$ is s.p.d.

To choose $\alpha$, I need the eigenvalues of the matrix $ab^T$. This ought to be simple enough, but I must be missing the key step to find a closed-form expression for the eigenvalues. Clearly, there will be at most two non-zero eigenvalues, and the corresponding eigenvectors must lie in the plane spanned by the vectors $a,b$. The eigenvalues are then the minimal and maximal value of the Rayleigh quotient $$ R(x) := \frac{x^T (ab^T) x}{x^T x} = \frac{(a^Tx) (b^Tx)}{x^T x} $$ over all $x \in \text{span}(a,b)$.

I suspect that there is a closed form expression for the minimal and maximal value of $R(x)$, but its form eludes me.

Wolfgang Bangerth
  • 55,373
  • 59
  • 119
  • Isn't that simply $\lambda_{1,2} = a^Tb$? – Christian Clason Jul 06 '16 at 13:08
  • 3
    Sorry, should be $\lambda = a^Tb$, because $ab^T$ is rank 1 and therefore only has one non-zero eigenvalue (with eigenvector $a$). – Christian Clason Jul 06 '16 at 13:15
  • 2
    $(ab^T)a=(b^Ta)a$ implies that the right-eigenvector is $a$ and the eigenvalue is $b^Ta$ – Richard Zhang Jul 06 '16 at 13:56
  • 1
    Also, since $ab^T$ is nonsymmetric in general, the maximal absolute value of its Rayleigh quotient (i.e. its numerical radius) is in general strictly larger than its maximum absolute eigenvalue (i.e. its spectral radius). – Richard Zhang Jul 06 '16 at 14:00
  • Ah, interesting question. Maybe I have managed to mislead myself then. What I want is that $\tilde J$ is positive definite. For this, I actually need the minimal value of $R(x)$ (i.e., the most negative value, if $R(x)$ can actually be negative). As @RichardZhang points out, this may be a question unconnected to the eigenvalue(s) of $ab^T$. – Wolfgang Bangerth Jul 06 '16 at 20:51
  • If the OP would be somebody else I would answer with: Are you sure that you really want to approximate $ab^T$ or would you rather try the matrix inversion lemma aka Sherman-Morrison-Woodbury formula for $I+ab^T$? – Dirk Jul 07 '16 at 06:33
  • @dirk -- yes, fair question :-) This is the simplified version. What I state above is really just the matrix that sits in the middle of my bilinear form and is multiplied from the left and from the right by the gradient of shape functions. Well, really, it's multiplied by symmetric gradients of vector-valued functions, and the thing in the middle are rank-4 tensors, but if I understand the question above I will understand the problem I have ;-) – Wolfgang Bangerth Jul 07 '16 at 11:34

1 Answers1

7

This is sometimes known as Buzano's inequality (http://www.jstor.org/stable/2159168). In general, if $\|x\|=1$, $P=xx^{\top}$ is a projection operator, so a simple application of Cauchy-Schwarz leads to $$ 2|b^{\top}xx^{\top}a|-|b^{\top}a| \leq |b^{\top}(2P-I)a| \leq \|a\|\,\|b\|, $$ which gives Buzano's inequality $$ |b^{\top}xx^{\top}a| \leq \frac{\|a\|\,\|b\|+|a^{\top}b|}{2}. $$

Without loss of generality let $a=(1,0)$, $b=(p,q)$ ($p^2+q^2=1$, $q\geq0$) and $x = (\alpha, \beta)$ ($\alpha^2+\beta^2=1$, $\alpha\geq0$). Finding the extrema of $ b^{\top}xx^{\top}a = \alpha(\alpha p+\beta q)$ is easily done with Lagrange multipliers, giving the values $$ \frac{p\pm1}{2}. $$ Here $p = \frac{a^{\top}b}{\|a\|\|b\|} = \cos\angle ab$.

From this $$ \frac{a^{\top} b-\|a\|\|b\|}{2} \|x\|^2 \leq a^{\top} xx^{\top}b \leq \frac{ a^{\top}b+\|a\|\|b\|}{2} \|x\|^2 $$ (with both inequalities being tight) so the admissible values of $\alpha$ are given by $$ 1 + \tfrac12(p\pm1)\alpha \|a\|\|b\| > 0, \qquad \frac{-2}{\|a\|\|b\| + a^{\top}b} < \alpha < \frac{2}{\|a\|\|b\|-a^{\top}b}. $$

Kirill
  • 11,438
  • 2
  • 27
  • 51
  • it's been a while since you wrote this response, and I greatly appreciate it. Out of curiosity, do you agree that Buzano's inequality is not actually necessary? Starting with "Finding the extrema..." you have a second line of reasoning that leads to the result by simply computing the maximum and minimum eigenvalue directly, without apparent recourse to Buzano's inequality. – Wolfgang Bangerth Apr 04 '17 at 17:47
  • @WolfgangBangerth I think you're right. I don't remember, but I probably just found the inequality first, so then it became part of the answer. – Kirill Apr 04 '17 at 18:06
  • Then I made the same mistake as you -- I sat down to write out the argument (for the substantially more complex case I actually have) and tried to first make it via Buzano's inequality -- just to remove it all again when I realized that the whole thing can actually be done by just computing the min/max of the Rayleigh quotient by hand. – Wolfgang Bangerth Apr 04 '17 at 19:01
  • @WolfgangBangerth Could you be any more arrogant? You ask such questions and you're a professor...? That's embarrassing, I'm glad I didn't take any of your classes. – AppleblueberryCrumb Jul 15 '21 at 19:54