
It was suggested on math.stackexchange.com that I try to ask this question here.

Consider a bounded linear operator $A : U \to V$ where $U$ is finite dimensional and $V$ is a separable Hilbert space (or of such large dimension that $A$ cannot be stored explicitly), with $A$ instead available as an algorithm for computing $y \mapsto Ay$. Let $f \in V$ and consider the standard least-squares problem: find $x \in U$ such that $\|Ax-f\|=\text{min}!$. We assume that inner products of columns of $A$, i.e., the matrix elements of $M = A^*A$, can be computed efficiently, and similarly that $A^*f$ can be computed efficiently. However, we also assume that $M = A^*A$ is ill-conditioned.

Are there efficient algorithms for the solution of this least-squares problem that avoid the ill-conditioning of $A$?

Additional details added in edit:

In my concrete example, $A$ is of the form $$ Ae_i = u_i \in V = L^2(\mathbb{R}^n),$$ with prescribed vectors $u_i$, where $\{e_i\}$ is some orthonormal basis for $U$. The normal equations read $$ Mx = A^* f, \quad M = A^* A,$$ which take the usual matrix form when we expand $x = \sum_i x_i e_i$ and introduce the matrix with elements $M_{ij} = \langle u_i,u_j\rangle_{L^2}$ and the vector $(A^*f)_i = \langle u_i, f \rangle_{L^2}$. Thus, the normal equations are easy to write down and solve if $M$ is well-conditioned; in practice, however, it turns out to be ill-conditioned. The LSQR algorithm mentioned in the comments does not seem viable here, as it requires an algorithm for $u \mapsto A^* u$.
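For reference, assembling and solving these normal equations takes only a few lines given any routine for the inner products; here is a minimal Python sketch (the routine name `inner` is hypothetical, not from any library):

```python
import numpy as np

def normal_equations(us, f, inner):
    """Assemble M_ij = <u_i, u_j> and b_i = <u_i, f> from a user-supplied
    routine inner(g, h) returning the L^2 inner product (hypothetical name)."""
    n = len(us)
    M = np.array([[inner(us[i], us[j]) for j in range(n)] for i in range(n)])
    b = np.array([inner(u, f) for u in us])
    return M, b

# x = np.linalg.solve(M, b)   # adequate only while M is well-conditioned
```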

To be even more concrete, in the $V = L^2(\mathbb{R})$ case I have $N = 3K$ basis functions. The first $K$ basis functions are Gaussians $u_i(x) = \exp(-\alpha_i(x-z_i)^2)$ with $\alpha_i,z_i\in\mathbb{C}$. The remaining $2K$ basis functions are the derivatives of the $u_i$ with respect to the complex parameters. Moreover, $f\in L^2$ is a function for which I have an algorithm for computing $\langle u_i, f\rangle_{L^2}$, $1\leq i \leq 3K$. Note that inner products of any pair of Gaussians and their derivatives are readily computed, so I also have an algorithm for $M = A^* A$.
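For the plain Gaussians, the matrix elements of $M$ come from the closed-form overlap $\int e^{-a(x-p)^2 - b(x-q)^2}\,dx = \sqrt{\pi/(a+b)}\,e^{-ab(p-q)^2/(a+b)}$, valid for $\Re(a+b) > 0$. A small Python sketch of the Gram-matrix assembly (function names are mine), which also makes it easy to inspect the condition number:

```python
import numpy as np

def gauss_overlap(ai, zi, aj, zj):
    """<u_i, u_j> in L^2(R) for u(x) = exp(-a (x - z)^2) with complex a, z.
    With A = conj(ai), P = conj(zi), B = aj, Q = zj, the integral
    int exp(-A(x-P)^2 - B(x-Q)^2) dx equals
    sqrt(pi/(A+B)) * exp(-A*B*(P-Q)^2 / (A+B)), requiring Re(A+B) > 0."""
    A, P = np.conj(ai), np.conj(zi)
    B, Q = aj, zj
    S = A + B
    return np.sqrt(np.pi / S) * np.exp(-A * B * (P - Q) ** 2 / S)

def gram_matrix(alpha, z):
    """Gram matrix of the K plain Gaussians; overlaps involving the
    derivative basis functions follow by differentiating gauss_overlap
    with respect to the parameters (omitted here)."""
    K = len(alpha)
    M = np.empty((K, K), dtype=complex)
    for i in range(K):
        for j in range(K):
            M[i, j] = gauss_overlap(alpha[i], z[i], alpha[j], z[j])
    return M

# Nearly coincident Gaussians make M nearly singular, e.g.:
# alpha = np.array([1.0, 1.0 + 1e-8]); z = np.array([0.0, 1e-8])
# print(np.linalg.cond(gram_matrix(alpha, z)))
```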

Jas Ter
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Dec 02 '21 at 14:49
  • You're right to be concerned about the conditioning of $A^*A$. I think some kind of incremental SVD could help -- see this question or this paper. – Daniel Shapero Dec 02 '21 at 16:57
  • Is the matrix $A$ sparse? – Federico Poloni Dec 03 '21 at 14:49
  • In my edit I tried to explain my concrete problem. The matrix $A$ must really be considered an operator: When acting on a vector it gives a linear combination of some fixed Hilbert space vectors (actually, gaussian functions). So $A$ is in a sense sparse, but is not represented by a matrix. – Jas Ter Dec 04 '21 at 18:57

1 Answer


There's no reason to compute the elements of $M=A^{*}A$ here. You will need the ability to compute the adjoint operator $z \mapsto A^{*}z$. With that, you can use a matrix-free iterative least-squares algorithm like LSQR. Because your problem is ill-conditioned, you'll need to add some regularization (this is available as an option in LSQR).
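To illustrate the matrix-free setup (the discretization here is my own assumption, not part of the answer): if $V$ is sampled on a quadrature grid, scaling rows by the square roots of the weights turns the weighted $L^2$ norm into the Euclidean norm that LSQR minimizes, and the `damp` argument of `scipy.sparse.linalg.lsqr` supplies the Tikhonov regularization:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

def weighted_lsqr(U, f, w, damp=1e-8):
    """U: (m, n) samples of the basis functions u_i on a quadrature grid,
    f: (m,) samples of the target, w: (m,) quadrature weights, all real here
    (complex bases need a real/imaginary splitting, or a solver whose
    complex support you have verified)."""
    sw = np.sqrt(w)
    B = U * sw[:, None]                      # B = diag(sqrt(w)) U
    g = sw * f
    Aop = LinearOperator(B.shape,
                         matvec=lambda x: B @ x,
                         rmatvec=lambda y: B.T @ y,  # exact Euclidean adjoint
                         dtype=B.dtype)
    return lsqr(Aop, g, damp=damp)[0]        # damp = Tikhonov parameter
```

Note that LSQR requires `rmatvec` to be the true adjoint of `matvec` in the Euclidean inner product, which is why the weights are folded into $B$ rather than into the adjoint.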

Brian Borchers
  • Thanks! LSQR seems interesting. However, in my case $A$ in fact maps into a Hilbert space, so the action of $A^*$ cannot really be computed, and I believe the conditions of the algorithm are not satisfied. I will check, though. – Jas Ter Dec 03 '21 at 13:27
  • You can also regularize the normal equations by solving $(M+\lambda^{2} I) x=A^{*}f$. This is equivalent to minimizing $\| Ax - f \|^{2}+\lambda^{2} \| x \|^{2}$. You really need to consider why this least-squares problem is ill-conditioned, whether regularization is appropriate, and exactly what regularization to use (a minimal sketch follows these comments). – Brian Borchers Dec 04 '21 at 17:42
  • In the question, you state that you can find $A^{*}f$. If so, you can use the same approach within an iterative least squares solver. – Brian Borchers Dec 04 '21 at 17:44
  • I sort of understand where the ill-conditioning comes from. My condition numbers are in the $10^{20}$ range, and seem to be even larger in some cases. The basis functions that comprise $A$ are Gaussians and their derivatives with respect to, say, the center of the Gaussian. Thus, I can compute $A^*f$ for $f$ a Gaussian or certain other special functions, but not for general $f$... – Jas Ter Dec 04 '21 at 19:07
  • I know that the basis vectors are linearly independent in exact arithmetic, and finding a preconditioner should amount to taking clever linear combinations of the basis functions. – Jas Ter Dec 04 '21 at 19:08
  • These are the kind of details that should be added to your question. – Brian Borchers Dec 04 '21 at 19:22
  • Thanks, I added some details in a second edit. – Jas Ter Dec 04 '21 at 19:44
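Regarding the regularization and preconditioning points in the comments above, here is a minimal sketch (my own, with a hypothetical $\lambda$) of solving the regularized normal equations. The Hermitian eigendecomposition of $M$ both performs the solve stably and exposes the near-null eigenvectors, i.e., the "clever linear combinations" of basis functions that a preconditioner would have to absorb:

```python
import numpy as np

def tikhonov_solve(M, b, lam):
    """Solve (M + lam^2 I) x = b directly."""
    return np.linalg.solve(M + lam**2 * np.eye(M.shape[0]), b)

def tikhonov_solve_eigh(M, b, lam):
    """Same solution via M = V diag(s) V^*: the filter factors
    1/(s_k + lam^2) show how the tiny eigenvalues behind the ~1e20
    condition numbers are damped; dropping those eigenpairs instead
    gives a truncated, canonically orthogonalized basis."""
    s, V = np.linalg.eigh(M)                 # M is Hermitian PSD
    return V @ ((V.conj().T @ b) / (s + lam**2))
```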