
Suppose I have the following regression setting:

$y_n = f_n + \epsilon_n$ where $f_n = f(x_n)$ and $\epsilon_n \sim N(0, \sigma^2)$ i.i.d.

Let $\textbf{f} = [f(x_1), \ldots , f(x_N)]$ and $\textbf{y} = [y_1, \ldots , y_N]$.

We assume $f \sim GP(0, k)$, so that $\textbf{f} \sim N(0, K)$, where $K$ is the $N\times N$ kernel matrix evaluated at the points $x_1, \ldots, x_N$ using the kernel function $k$ of the GP prior.
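For concreteness, here is a minimal sketch of building $K$. The squared-exponential (RBF) kernel and the lengthscale are my own assumptions, since the question leaves $k$ unspecified:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    # k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2))  -- assumed kernel choice
    return np.exp(-(x1 - x2) ** 2 / (2 * lengthscale ** 2))

# N training inputs (illustrative values)
x = np.linspace(0.0, 1.0, 5)
N = len(x)

# K[i, j] = k(x_i, x_j), built by broadcasting
K = rbf_kernel(x[:, None], x[None, :])

assert K.shape == (N, N)
assert np.allclose(K, K.T)                      # symmetric
assert np.all(np.linalg.eigvalsh(K) > -1e-10)   # positive semi-definite
```

Any valid kernel yields a symmetric positive semi-definite $K$, which is what the prior $\textbf{f} \sim N(0, K)$ requires.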

I am interested in calculating the posterior $p(\textbf{f} \mid \textbf{y})$.

This post gives the following expression for the posterior: $\textbf{f} \mid \textbf{y} \sim N\left(\sigma^{-2}\left( K^{-1} + \sigma^{-2}I\right)^{-1}\textbf{y},\ (K^{-1} + \sigma^{-2}I)^{-1}\right)$, and I have personally gone through the derivation and found it correct.

However, equation 5 of this post gives the posterior as $\textbf{f} \mid \textbf{y} \sim N\left(K(\sigma^2I + K)^{-1}\textbf{y},\ \sigma^2 (\sigma^2I + K)^{-1}K\right)$. I have no idea how they derived this.

So which of these two is actually correct?


1 Answer


I'll assume that the kernel matrix $K$ is invertible, and I'll start from the latter expression and derive the former.

For the mean:

$$
\begin{aligned}
K(\sigma^{2}I+K)^{-1} &= K(\sigma^{2}K^{-1}K+K)^{-1} \\
&= K\left((K^{-1}+\sigma^{-2}I)\,\sigma^{2}K\right)^{-1} \\
&= \sigma^{-2}KK^{-1}(K^{-1}+\sigma^{-2}I)^{-1} \\
&= \sigma^{-2}(K^{-1}+\sigma^{-2}I)^{-1}
\end{aligned}
$$

For the covariance:

$$
\begin{aligned}
\sigma^{2}(\sigma^{2}I+K)^{-1}K &= \sigma^{2}\left(K\sigma^{2}(K^{-1}+\sigma^{-2}I)\right)^{-1}K \\
&= \sigma^{2}(K^{-1}+\sigma^{-2}I)^{-1}\sigma^{-2}K^{-1}K \\
&= (K^{-1}+\sigma^{-2}I)^{-1}
\end{aligned}
$$

So the two expressions describe exactly the same posterior, and both posts are correct.
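The two matrix identities can also be sanity-checked numerically. This sketch uses an arbitrary symmetric positive-definite matrix as $K$ (any such matrix works, since the identities only require invertibility):

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma2 = 5, 0.3

# Random symmetric positive-definite stand-in for the kernel matrix K
A = rng.standard_normal((N, N))
K = A @ A.T + N * np.eye(N)
I = np.eye(N)

# Mean factor: K (sigma^2 I + K)^{-1}  vs  sigma^{-2} (K^{-1} + sigma^{-2} I)^{-1}
m1 = K @ np.linalg.inv(sigma2 * I + K)
m2 = np.linalg.inv(np.linalg.inv(K) + I / sigma2) / sigma2
assert np.allclose(m1, m2)

# Covariance: sigma^2 (sigma^2 I + K)^{-1} K  vs  (K^{-1} + sigma^{-2} I)^{-1}
c1 = sigma2 * np.linalg.inv(sigma2 * I + K) @ K
c2 = np.linalg.inv(np.linalg.inv(K) + I / sigma2)
assert np.allclose(c1, c2)
```

(Note that in practice one would prefer the $(\sigma^2 I + K)^{-1}$ forms, which avoid inverting $K$ itself and remain valid even when $K$ is singular.)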
