
Recently, I've been struggling to understand the limitations and capabilities of the successive over-relaxation (SOR) method for boundary value problems that are ill-posed, such as a boundary value problem (BVP) in the form of the Poisson equation with pure Neumann boundary conditions ($\nabla^2u=f$ and $\frac{\partial u}{\partial a}\bigg|_{a=x,y,z}=n(a)$), or more generally any singular linear system $Ax=b$. Some literature I've encountered goes as far as saying that SOR cannot even tackle such problems because of the over-relaxation it is based on.

On the other hand, people often mention practical tricks that might allow ill-posed problems to be solved, but they do not specify what form such tricks take for specific solver methods. An example is Tikhonov regularization, where a small positive constant is added to the main diagonal of the matrix $A$. However, based on my own experience and that of some other Stack Exchange users, this may not always yield a desirable solution. Further, it is sometimes proposed to replace a single point in the set of Neumann boundaries with a Dirichlet boundary. However, as pointed out here by DrHansGruber, this may not be a correct and desirable approach.

Therefore, my question is: is there a practical and functional way to make the SOR method converge in the case of an ill-posed BVP, and if so, what is it and how does one implement it in practice?

Akhaim
  • I think it's worthwhile to make a distinction between ill-conditioned and ill-posed problems. The former have a unique solution (even if not stable), the latter have no or infinitely many solutions. The question confuses these concepts. – Wolfgang Bangerth Oct 10 '22 at 03:35
  • @WolfgangBangerth, can you point out what part of the question confuses the two concepts? Based on the specified example of the Poisson equation with pure Neumann boundaries, it should be clear that an ill-posed problem was meant here (infinitely many solutions). – Akhaim Oct 10 '22 at 18:10
  • Tikhonov regularization is typically used to solve ill-conditioned problems, not ill-posed problems. Pinning a single point converts an ill-posed problem into an ill-conditioned one. I can see how one would use methods such as SOR to solve ill-conditioned problems, but if you have an ill-posed problem, the only reasonable step is to first sit down and define what the "solution" you are looking for actually is, given that ill-posed problems do not have a unique solution. – Wolfgang Bangerth Oct 10 '22 at 21:19

1 Answer


I'll give a more general answer regarding a linear system $Ax = b$, since this is what you ultimately have to solve. I'll preface this by recalling that a problem is well-posed according to Hadamard if it has a solution, the solution is unique, and the solution depends continuously on the data. Once you discretise your specific problem you get a system of linear equations $Ax = b$, so going forward I will consider only systems of linear equations.

Existence of a Solution

A solution does not exist when $b$ is not in the range of $A$. If this arises in practice, one generally needs to study why the problem has no solution (this is typically a modelling issue) and what needs to be done to rectify it.

A standard "remedy" is to opt for the solution that least violates the equation in the $L_2$ norm: $\min_{u}\|Au-b\|^2_2$, which results in the normal equations: $A^TAu = A^Tb$. Note however that in some applications this modification of the problem may not be satisfactory, so you really ought to check why your model results in an empty solution set before applying the above "remedy".

In the conjugate gradients for the normal equations (CGNR) solver, if the initial guess is set to zero, then one actually recovers $u^* = A^{+}b$ (where $A^{+}$ is the Moore-Penrose pseudoinverse of $A$). If the initial guess is different, then one gets $u^*+v$, where $v$ is the component of the initial guess in the kernel/null-space of $A$.
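
As a rough illustration (this is just a sketch using NumPy/SciPy with a toy rank-deficient matrix; the specific numbers are arbitrary assumptions), one can check that CGNR started from zero reproduces the pseudoinverse solution:

```python
import numpy as np
from scipy.sparse.linalg import cg

# Toy rank-deficient symmetric matrix: rows sum to zero, so the constant
# vector spans the null space (similar in spirit to a Neumann Laplacian).
A = np.array([[ 2., -1., -1.],
              [-1.,  2., -1.],
              [-1., -1.,  2.]])
b = np.array([1., 0., -1.])   # orthogonal to the null space, so A x = b is consistent

# CGNR: apply CG to the normal equations A^T A u = A^T b with a zero initial guess.
u, info = cg(A.T @ A, A.T @ b, x0=np.zeros(3))

print(u)                      # iterates stay orthogonal to the null space ...
print(np.linalg.pinv(A) @ b)  # ... so u matches the pseudoinverse solution A^+ b
```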

Uniqueness of the Solution

The problem does not have a unique solution if $A$ (or $A^TA$ in the $\min_u\|Au-b\|^2_2$ case) is not full rank: if $u^*$ is a solution, then $u^*+v$ is also a solution, where $v$ is any vector from the kernel/null-space of $A$ (or $A^TA$). However, if the initial guess is zero in a conjugate gradients (CG) solver, then I believe you will recover the particular solution $u^* = A^{+}b$.

Continuity of the Solution

The problem is a system of linear equations, so the solution changes continuously with the data (provided that a solution exists; as noted, one can modify the problem to $\min_u \|Au-b\|^2_2$ in that case and solve $A^TAu = A^Tb$ instead).

Compatibility Conditions

Now let's go back to the continuous setting, and let your solution domain be $\Omega$. You have imposed a Neumann boundary condition of the form $\partial_v u(a) = n(a), \, a \in \partial\Omega$ where $v$ is the normal to the boundary at point $a$. Using the divergence theorem we can find that:

\begin{equation} \int_{\Omega} f = \int_{\Omega} \Delta u = \int_{\Omega} \nabla \cdot \nabla u = \int_{\partial \Omega} \partial_v u = \int_{\partial\Omega} n. \end{equation}

So the above compatibility condition must be satisfied. One way to achieve this is to change the mean value of $f$ to be equal to $\frac{\int_{\partial\Omega}n}{|\Omega|}$, which can be done in the following manner:

\begin{equation} g(a) = f(a) - \frac{\int_{\Omega} f}{|\Omega|} + \frac{\int_{\partial \Omega} n}{|\Omega|} \\\implies \\ \int_{\Omega}g = \int_{\Omega} f - \int_{\Omega}\frac{\int_{\Omega} f}{|\Omega|} + \int_{\Omega}\frac{\int_{\partial \Omega} n}{|\Omega|} = \int_{\Omega} f - |\Omega|\frac{\int_{\Omega} f}{|\Omega|} + |\Omega|\frac{\int_{\partial \Omega} n}{|\Omega|} = \int_{\partial\Omega} n \end{equation}

Then formulate the new problem:

\begin{equation} \Delta u (a) = g(a), \, a\in\Omega \\ \partial_v u (a) = n(a), \, a \in \partial\Omega. \end{equation}

A solution to the above problem exists, and it will exist in the discrete case too, provided you use a sensible discretisation.
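
As a sketch of how the discrete version of this correction might look (the grid, quadrature weights and sample data below are purely illustrative assumptions):

```python
import numpy as np

# Illustrative 2D setup on a uniform grid with spacing h (all names and sizes
# are assumptions for the sake of the example).
h = 1.0 / 64
f = np.random.rand(64, 64)     # samples of the source term f in Omega
n = np.random.rand(4 * 64)     # samples of the Neumann data n on the boundary

# Crude quadrature: cell area h^2 in the interior, segment length h on the boundary.
int_f     = f.sum() * h**2     # ~ integral of f over Omega
int_n     = n.sum() * h        # ~ integral of n over the boundary
vol_omega = f.size * h**2      # discrete |Omega|, consistent with the quadrature above

# Shift the source so that the discrete compatibility condition holds exactly.
g = f - int_f / vol_omega + int_n / vol_omega

print(g.sum() * h**2, int_n)   # the two integrals now agree
```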

Solvers

Once you have taken this compatibility condition into account, which solvers will work on your system depends only on your system matrix $A$. Note that while a discretisation of $\Delta$ generally results in a symmetric matrix, once you add the boundary conditions this does not have to be the case anymore. If the conditions are such that your matrix is diagonally dominant or symmetric positive-definite, then Jacobi, Gauss-Seidel, SOR and CG should all work. If your matrix is neither diagonally dominant nor symmetric positive (semi-)definite, then you may have to rely on a solver such as CGNR (which essentially solves $A^TAu = A^Tb$, where $A^TA$ is symmetric positive (semi-)definite). I do not believe that any of those solvers requires uniqueness of the solution; they only require that a solution exists (i.e. that $b$ is in the range of $A$, while for CGNR not even this is required). If your matrix is only positive semi-definite (i.e. you have infinitely many solutions), then which solution you end up with depends on the initial guess (I assume that for most solvers an initial guess of zero should lead to the Moore-Penrose pseudoinverse solution $u^* = A^{+}b$, though I am not sure whether this holds for all of them or only for CG).
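
To make this concrete for the original question, here is a minimal sketch of SOR applied to a 1D pure-Neumann Poisson discretisation with a compatible right-hand side (grid size, relaxation factor and source term are arbitrary choices for illustration, not a recommendation):

```python
import numpy as np

# 1D pure-Neumann Poisson problem: A is symmetric positive semi-definite and
# the constant vector spans its null space, so b must satisfy 1^T b = 0.
N, omega = 50, 1.5
h = 1.0 / (N - 1)

A = np.zeros((N, N))
for i in range(N):
    A[i, i] = 2.0
    if i > 0:     A[i, i - 1] = -1.0
    if i < N - 1: A[i, i + 1] = -1.0
A[0, 0] = A[-1, -1] = 1.0          # symmetric Neumann closure of the stencil
A /= h**2

x = np.linspace(0.0, 1.0, N)
b = np.sin(2.0 * np.pi * x)
b -= b.mean()                      # enforce the discrete compatibility condition

u = np.zeros(N)                    # the initial guess fixes the additive constant
for sweep in range(20000):
    for i in range(N):
        sigma = A[i, :] @ u - A[i, i] * u[i]          # uses already-updated entries
        u[i] += omega * ((b[i] - sigma) / A[i, i] - u[i])
    if np.linalg.norm(A @ u - b) <= 1e-8 * np.linalg.norm(b):
        break

print(sweep, np.linalg.norm(A @ u - b))
```

In this sketch the iteration converges to one particular solution of the singular system; a different initial guess only shifts the result by a constant (a null-space component), since any two solutions differ by a vector from the kernel of $A$.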

General Case

In the general case you may have a constraint that is less obviously implemented than the above. Then one usually uses a Lagrange multipliers approach. Assume that your constraints can be written as $Bu = n$, and the discretisation of your PDE without boundary conditions is $Au = f$, where $A$ is symmetric positive (semi-)definite. Then you can form the following quadratic programming problem with linear constraints:

\begin{equation} \min_u \left(\frac{1}{2}u^TAu - u^Tf\right), \,\textrm{s.t.}\, Bu = n. \end{equation}

One can use a Lagrange multipliers approach to rewrite this in the following manner:

\begin{equation} \min_u\max_{\lambda}\left(\frac{1}{2}u^TAu - u^Tf + \lambda^T(Bu-n)\right) \\ \implies \\ \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} u \\ \lambda \end{bmatrix} = \begin{bmatrix} f \\ n\end{bmatrix}. \end{equation}

The above system is indefinite but symmetric, which suggests using a solver such as SYMMLQ. For this to work, however, you must make sure that a solution exists. A solution will not exist if you have conflicting constraints, i.e. if $n$ is not in the range of $B$. Thus, if you suspect that your constraints may be conflicting, you can compute their projection onto the range of $B$: $m = BB^{+}n$. The part $p = B^{+}n$ can be computed in practice by using a CGNR solver initialised with the zero vector to solve the system $B^TBp = B^Tn$, after which you can compute $m = Bp$. Then you just plug in $m$ instead of $n$ in the system resulting from the Lagrange multipliers approach. A solution may also not exist if $f$ is not in the range of $A$; then you proceed in a similar manner: $g = AA^{+}f$.
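
A small sketch of assembling and solving such a saddle-point system (SciPy does not ship SYMMLQ, so MINRES, another Krylov solver for symmetric indefinite systems, is used instead; the single mean-value constraint $B = \mathbf{1}^T$ and the sizes are illustrative assumptions):

```python
import numpy as np
from scipy.sparse import bmat, csr_matrix
from scipy.sparse.linalg import minres

# Illustrative setup: A is a small symmetric positive semi-definite operator
# (a 1D Neumann-style Laplacian) and B = 1^T enforces a prescribed mean of u.
N = 20
main = 2.0 * np.ones(N)
main[0] = main[-1] = 1.0
A = csr_matrix(np.diag(main) - np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1))
B = csr_matrix(np.ones((1, N)))                 # single constraint row: sum(u) = n
f = np.random.rand(N)
f -= f.mean()                                   # keep f compatible with A
n = np.array([0.0])                             # prescribe a zero mean for u

# Symmetric indefinite KKT system [[A, B^T], [B, 0]] from the Lagrangian.
K   = bmat([[A, B.T], [B, None]], format='csr')
rhs = np.concatenate([f, n])

# SciPy has no SYMMLQ; MINRES also handles symmetric indefinite systems.
sol, info = minres(K, rhs)
u, lam = sol[:N], sol[N:]
print(info, np.linalg.norm(A @ u + B.T @ lam - f), (B @ u)[0])
```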

lightxbulb
  • many thanks for such a complete answer. You managed to reflect on and touch upon various relevant problems regarding ill-posed BVPs. One question here, regarding the compatibility conditions: won't removing the mean value of the source term $f$ lead to a 'new' problem being defined, since a new source term is created when one subtracts the mean? If this reasoning is correct, is there then a way to connect the old problem with the new one? – Akhaim Oct 10 '22 at 18:31
  • @Akhaim Your problem has no solution unless you satisfy the compatibility conditions. So yes, you need to modify the problem to get one that has a solution. At the same time $u$ is defined up to a constant, so you can choose between a multitude of solutions provided the compatibility conditions are satisfied, and you have to figure out which of those you need. I can give you an example from image processing: the parabolic heat diffusion equation $\partial_t u = \Delta u$ with $u(0,x) = f(x)$ turns into an elliptic one for $t\to\infty$, namely $\Delta u = 0$. If we want it to be consistent – lightxbulb Oct 10 '22 at 18:54
  • ...with the solutions involving zero Neumann boundary conditions for $t<\infty$, then we require that $\int_{\Omega} u = \int_{\Omega} f$, which yields a unique solution. So the standard approach would be $g = f - f_{mean}$, and once we solve the problem (say with an initial guess of zero) we do $v = u + f_{mean}$. You should differentiate, however, between the compatibility condition (which allows the existence of a solution) and the mean value of $u$, which extracts a single solution out of infinitely many, provided a solution exists in the first place. For some solvers the initial guess defines it – lightxbulb Oct 10 '22 at 18:56
  • A solver would lead you to a single solution even in problems that have infinitely many solutions (provided the solver is applicable to those), and the solution you end up with usually depends on the initial guess. What a solver cannot do is solve a problem that has no solution. There's a caveat here: if $b$ is outside the range of $A$ but within a distance of the range of $A$ matching the error tolerance, then the solver may still converge. Another caveat is that using solvers like CGNR modifies the problem, and it will always have a solution. Rather, understand your problem. – lightxbulb Oct 10 '22 at 19:06
  • thanks once more for such an excellent explanation. So, let me just reiterate this, as I am trying to grasp the essence of your message. Within the frame of the proposed problem (the 2D heat diffusion equation, which for $t\to\infty$ becomes the Laplace problem $\Delta u=0$ with non-zero, finite Neumann BCs in all directions, $n_{x1,2;y1,2}$), we end up with $\Delta u=n_{x1}+n_{x2}+n_{y1}+n_{y2}=f_{new}$. From there, one defines a new problem, $\Delta v = g$, where $g=f_{new}-f_{new}^{(mean)}$, with some new initial guess for $v$, e.g. $v_{init} = 0$. Once the $\Delta v = g$ problem is solved... – Akhaim Oct 10 '22 at 19:29
  • ... one can recover the original $u$ as $u=v+f_{new}^{(mean)}\times C$, where $C$ is a constant that sets the units of $u$. Is my reasoning here correct? – Akhaim Oct 10 '22 at 19:31
  • @Akhaim I don't really like the chat function of stack. As far as the example goes - it is with zero or reflecting Neumann boundary conditions. In image processing this would be equivalent to a reflection along the boundaries, and the pure Neumann problem corresponds to a convolution with a Gaussian with infinite standard deviation, so the solution is just the mean value. I believe you can reduce all other cases to the case of zero boundary conditions anyway by changing the right-hand side. The takeaway from my comment is that whatever you do, the compatibility condition must be satisfied... – lightxbulb Oct 10 '22 at 20:38
  • ...otherwise there is no solution. If the compatibility condition is satisfied then there are infinitely many solutions, and you can produce any of those by just shifting a known solution by a constant. In a problem with infinitely many solutions, which solution you end up with depends on the initial guess in the solver; a zero guess usually leads to the "canonical" solution with zero offset. Another approach, which is less efficient, is to use the Lagrange multiplier approach to enforce a single solution. For example: $\Delta u = 0$ and $\int_{\Omega} u = \int_{\Omega} f$ can be implemented as – lightxbulb Oct 10 '22 at 20:45
  • $L u = 0$ so that $1^T u = 1^T f$ in the discrete case, where $L$ is the discretisation of the Laplacian with the corresponding boundary conditions. Then the system resulting from the Lagrange multipliers approach reads $\begin{bmatrix} L & 1 \\ 1^T & 0 \end{bmatrix} \begin{bmatrix} u \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ 1^Tf \end{bmatrix}$. This is a symmetric but indefinite system, thus SYMMLQ is a good solver for it. – lightxbulb Oct 10 '22 at 20:49
  • Indeed, by packing the Neumann boundary conditions into the right-hand side (whether $f_{original}$ is a constant or a function of $x$ and $y$ (in 2D), with or without zero mean) and removing the mean from $f_{original}+Neumann$, it should be as if some new, nonuniform $f_{new}(x,y)$ with zero mean is introduced, together with an equivalent zero-valued Neumann boundary condition. – Akhaim Oct 10 '22 at 21:02
  • And then, as far as I understand, I am to proceed to select one of the infinitely many available solutions, as the previously described technique only ensures that a solution will exist, but it does not pin down a specific one. I hope this is what you tried to explain. – Akhaim Oct 10 '22 at 21:05
  • @Akhaim Yes, that's basically it. As far as the solution goes, I would suggest using a conjugate gradients (CG) solver with an initial guess of zero, then adding a constant to the solution at the end based on knowledge you have about the problem. As mentioned, usually that's just the mean that was subtracted away. I recommend CG over SSOR as it doesn't require picking any parameters and has very good convergence; you need a symmetric matrix though. Beyond that, multigrid and fast Fourier solvers may work. I don't know the specific problem you are solving so I cannot say more. – lightxbulb Oct 10 '22 at 21:56