7

I trying to wrap my head of derivation of the analytic FEM Jacobian for the Newton method. Say we have a nonlinear Poisson problem of the (weak) form

$$ \int a(u)\nabla\ u\cdot \nabla v = \int f v $$

where $a(u)$ is a coefficient. From the derivation of the Newton method the Jacobian matrix (in the Newton update $\Delta u = -J(u_k)R(u_k) $ will be

$$ J(u) = \int a(u)\nabla\ \phi_i\cdot \nabla \phi_j + \int \frac{\partial a(u)}{\partial u}\phi_i\nabla\ u\cdot \nabla \phi_j $$

which essentially is the original stiffness matrix plus a contribution due to the derivative of $a(u)$. Now if for example $a(u)=u^2$ then the 2nd Jacobian term will be $J_2=\int 2u\ \phi_i\nabla\ u\cdot \nabla \phi_j$ which is ok. But if now $a(u)=(\frac{\partial u}{\partial x})^2$ or even $a(u)=\frac{\partial u}{\partial x}\frac{\partial u}{\partial y}$, how does this work as the problem is still nonlinear but $\frac{\partial a(u)}{\partial u}$ is essentially zero?

EDIT: Could someone verify that these Jacobian forms are indeed correct so that I have understood it correctly.

$$ a(u) = u\frac{\partial u}{\partial x} \Rightarrow J_2 = \int (u\frac{\partial \phi_i}{\partial x} + \frac{\partial u}{\partial x}\phi_i)(\nabla\ u\cdot \nabla \phi_j) $$

$$ a(u) = (\frac{\partial u}{\partial x})^2 \Rightarrow J_2 = \int 2\frac{\partial u}{\partial x}\frac{\partial \phi_i}{\partial x}(\nabla\ u\cdot \nabla \phi_j) $$

$$ a(u) = \frac{\partial u}{\partial x}\frac{\partial u}{\partial y} \Rightarrow J_2 = \int (\frac{\partial u}{\partial x}\frac{\partial \phi_i}{\partial y} + \frac{\partial u}{\partial y}\frac{\partial \phi_i}{\partial x})(\nabla\ u\cdot \nabla \phi_j) $$

where $u$ and $\nabla u$ are evaluated explicitly with the current solution in the assembly.

Anton Menshov
  • 8,672
  • 7
  • 38
  • 94
B Ring
  • 73
  • 4

2 Answers2

11

Your example is a pretty good indication that the two derivatives (with respect to $x$ and with respect to $u$) do not commute :) (In fact, they're very different beasts -- one is a Fréchet derivative, the other a weak derivative.)

Rather, you should consider $a(u) = (\partial_x u)^2$ as the composition $a = f\circ g$, of two functions $$ f: v(x)\mapsto v(x)^2$$ and $$ g: v(x) \mapsto (\partial_x v)(x)$$ and apply the chain rule $$ \partial_u a(u) = \partial_u f(g(u)) \circ \partial_u g(u).$$ Here, it is important to note that derivatives of such a mapping $a$ between functions are actually linear operators, not functions (and the function spaces are important, but I'll ignore that here) -- the derivative $\partial_u a(u)$ acts on a given function $h$ to show how the output $a(u)$ changes if the input $u$ is perturbed in the direction of $h$. When computing such derivatives, it is therefore very helpful to keep these directions around until the end!

For $f$ (which is called a superposition or Nemytskii operator), one can under some conditions (again, ignored here) show that for any function $h$ $$\partial_u f(v) h = 2 v * h,$$ where $*$ denotes the pointwise multiplication. (Basically, the rule is if you have a mapping $v(x)\mapsto f(v(x))$ for some differentiable $f:\mathbb{R}\to\mathbb{R}$, then its derivative in $u$ -- if it exists -- is given by the pointwise multiplication with $f'(u(x))$.)

For $g$, you can use that the (weak) derivative is linear and so the derivative is the linear operator itself, i.e., $$\partial_u g(v)h = \partial_x h.$$

Together, you obtain $$\partial_u a(u)h = 2\partial_x u * \partial_x h,$$ where $*$ denotes the pointwise multiplication. In your equation for $J(u)$, $\phi_i$ is taking the place of $h$, so the corresponding Jacobian term would be $$ \int_\Omega 2\partial_x u(x) \partial_x \phi_i(x) (\nabla u(x) \cdot \nabla \phi_j(x))\,dx.$$

In your second example, you proceed similar using the product rule instead of the chain rule to get $$\partial a(u)h = \partial_x u * \partial_y h + \partial_y u * \partial_x h.$$

Christian Clason
  • 12,301
  • 3
  • 48
  • 68
  • Thank you for the detailed explanation, I think I follow but shouldn't $a(u) = (\nabla u)^2$ then result in $$ \int_\Omega 2\nabla u(x) \nabla\phi_i(x)(\nabla u(x) \cdot \nabla \phi_j(x)),dx.$$ instead of $$ \int_\Omega 2u(x) (\nabla \phi_i(x) \cdot \nabla \phi_j(x)),dx.$$ where $2\nabla u(x) \nabla\phi_i(x)$ is due to $\partial_u a(u)$ and $(\nabla u(x) \cdot \nabla \phi_j(x))$ from the original bilinear form (now evaluated explicitly in $u$)? – B Ring Jan 11 '18 at 03:32
  • Yes, sorry, I forgot the gradient there. Fixing it now! – Christian Clason Jan 11 '18 at 07:47
  • ...and actually using your example and not the full gradient for $a(u)$. (What threw me initially is that I usually see the the nonlinearity as containing the full dependence on $u$, so I mixed up the gradients.) – Christian Clason Jan 11 '18 at 08:43
3

You need to understand how to actually compute derivatives when you want to take the derivative with respect to a function. I've recorded a lengthy example in lecture 31.55 here: http://www.math.colostate.edu/~bangerth/videos.html

In short, you need to consider the derivative as the limit of a variation.

Wolfgang Bangerth
  • 55,373
  • 59
  • 119