8

I have the following question:

Suppose I have two matrices $X, Y$ both of size $m\times p$ and a random i.i.d Gaussian matrix $G$ of size $m \times k$, $m\gg p>k$.

Is there a fast way to compute $\exp(-XY^T)G$? Perhaps by using the fact that both $X$ and $Y$ are much smaller than $XY^T$? Here, $\exp(\cdot)$ means entry-wise exponent (i.e. of $\textbf{each}$ entry of the matrix). Clearly, without the exponent, it's easy and can be done simply by $X(Y^TG)$, but the problem is that the element-wise exponent is no longer low rank.

The motivation for the question is multiplying a kernel of the form

$D_{ij}=\exp(-d_{ij})$ or $D_{ij}=\exp(-d_{ij}^2)$

with $d_{ij}=\Vert x_i-y_j \Vert$ and $x_i, y_i \in \mathbb{R}^p$ by a random matrix efficiently.

An approximated solution is also fine.

Gil
  • 382
  • 1
  • 10
  • 5
    I find the question clear (I don't have any ideas myself unfortunately), but the fact that the exponential is component-wise I think is going to mislead a lot of people - $\exp(A)$ is a very standard notation for matrix exponential, especially in the theory of ODEs and PDEs. Could you maybe rephrase the title a bit? – Kirill Nov 16 '15 at 19:50
  • Define fast? Do you mean in a vectorized way without looping over the whole matrix $-XY^T$? – nluigi Nov 17 '15 at 17:24
  • Computational complexity lower than $\mathbb{O}(m^2k)$. – Gil Nov 17 '15 at 21:54
  • 1
    NIPS'16 paper that might be relevant http://papers.nips.cc/paper/6246-orthogonal-random-features – Memming Feb 21 '17 at 13:50
  • 1
    I looked into the paper, it is very helpful. Thanks! @Memming – Gil Feb 28 '17 at 12:15

1 Answers1

1

If approximations suffice, maybe we could begin by developing the $\exp$ operator as follows: $$ \exp(Z) = \sum\limits_{k=0}^{\infty} \frac{Z^k}{k!} = 1+Z+\frac{Z^2}{2} + \frac{Z^3}{6} + \frac{Z^4}{24} + ... $$ Note that the power terms follow your element-wise notation and refer to the Hadamard powers, slightly abusing the standard conventions.

This gives possibilities for re-writing the expression: $$ \begin{align} \exp{(-XY^T)}G &= \Big(I-XY^T+\frac{(XY^T)^2}{2} - \frac{(XY^T)^3}{6} + \frac{(XY^T)^4}{24} - ...\Big)G \\ &=G-XY^TG+\frac{(XY^T)^2G}{2} - \frac{(XY^T)^3G}{6} + \frac{(XY^T)^4G}{24} - ... \end{align} $$ where $I$ is an appropriately sized identity matrix. If one could live with a first order approximations, then: $$ \begin{align} \exp{(-XY^T)}G &= G-XY^TG+ O(max(|XY^T|)^2)\\ &\approx G-XY^TG = G-X(Y^TG) \end{align} $$ resulting in an easier operation. For higher order approximations, you would like to work out the solution for the Hadamard power, which is probably a bit harder or I haven't seen an immediate solution yet.

Tolga Birdal
  • 2,229
  • 15
  • 24