I am reading Gaussian Processes for Machine Learning, equation 2.9, where the predictive distribution is derived as
$$p(f_* | \mathbf{x}_*, X, \mathbf{y}) = \int p(f_* | \mathbf{x}_*, \mathbf{w}) p(\mathbf{w} | X, \mathbf{y}) d \mathbf{w}.$$
I tried to evaluate the integral analytically, as is done for equation 2.8, but the terms became unmanageable. I also tried treating $p(f_* | \mathbf{x}_*, \mathbf{w})$ as a delta function, as in here, but I still cannot see how the result follows.
$$ \begin{align*} p(f_* | \mathbf{x}_*, X, \mathbf{y}) &= \int p(f_* | \mathbf{x}_*, \mathbf{w}) p(\mathbf{w} | X, \mathbf{y}) d \mathbf{w} \\[5pt] &= \int \delta(f_* - \mathbf{x}_*^T \mathbf{w}) p(\mathbf{w} | X, \mathbf{y}) d \mathbf{w}. \end{align*} $$
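As a numerical sanity check of what the delta-function form should produce, here is a minimal sketch. It assumes the Gaussian posterior from equation 2.8, $\mathbf{w} | X, \mathbf{y} \sim \mathcal{N}(\bar{\mathbf{w}}, A^{-1})$ with $A = \sigma_n^{-2} X X^T + \Sigma_p^{-1}$, and every number in it (dimensions, data, noise level, test input) is made up for illustration. Integrating against $\delta(f_* - \mathbf{x}_*^T \mathbf{w})$ amounts to asking for the distribution of $\mathbf{x}_*^T \mathbf{w}$ when $\mathbf{w}$ is drawn from the posterior, so the sketch samples $\mathbf{w}$ and compares the sample moments of $f_*$ with the mean and variance stated in equation 2.9.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: all values below are made up for illustration.
D, n = 3, 20
sigma_n = 0.5                       # noise standard deviation
Sigma_p = np.eye(D)                 # prior covariance of w
X = rng.normal(size=(D, n))         # inputs stored as columns (D x n layout)
w_true = np.array([1.0, -2.0, 0.5])
y = X.T @ w_true + sigma_n * rng.normal(size=n)
x_star = np.array([0.3, -1.0, 2.0])

# Posterior over the weights (equation 2.8): w | X, y ~ N(w_bar, A^{-1}).
A = X @ X.T / sigma_n**2 + np.linalg.inv(Sigma_p)
A_inv = np.linalg.inv(A)
A_inv = (A_inv + A_inv.T) / 2       # remove round-off asymmetry
w_bar = A_inv @ X @ y / sigma_n**2

# Monte Carlo version of the delta integral: draw w, then set f_* = x_*^T w.
w_samples = rng.multivariate_normal(w_bar, A_inv, size=200_000)
f_samples = w_samples @ x_star

# Mean and variance as stated in equation 2.9.
mean_closed = x_star @ w_bar
var_closed = x_star @ A_inv @ x_star

print("mean:    ", f_samples.mean(), "vs", mean_closed)
print("variance:", f_samples.var(), "vs", var_closed)
```

The two pairs of numbers should agree up to Monte Carlo error. This only confirms the target of the derivation numerically; it does not show how to carry out the integral by hand.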
Properties of delta functions include
$$ \begin{align*} \int \delta(x - x_0) f(x) dx = f(x_0) \end{align*} $$
which I cannot see how to apply here. I also read the matrix variant of the delta function in the Matrix Cookbook, equation 548,
$$ \int \delta(\mathbf{x} - A\mathbf{s}) p(\mathbf{s}) d\mathbf{s} = \frac{1}{\sqrt{\det(A^TA)}} p(A^+\mathbf{x}), $$
where $A^+$ is the pseudoinverse of $A$. Then
$$ \begin{align*} p(f_* | \mathbf{x}_*, X, \mathbf{y}) &= \frac{1}{\sqrt{\det(\mathbf{x}_* \mathbf{x}_*^T)}} p_{\mathbf{w}}((\mathbf{x}_* \mathbf{x}_*^T)^{-1} \mathbf{x}_* f_* | X, \mathbf{y}) \end{align*} $$
and I am stuck here. Sorry for creating one more post; I am doing so because I cannot make comments right now.
most general in the sense of directly using delta function properties as you've done – muser Aug 23 '23 at 16:58