
I am having trouble "mapping" the variables in the Bayes equation onto the case of regression. As notation, say $$ P(\theta|D) = \frac{P(D|\theta) P(\theta)}{ P(D) } $$ I have come to think of $\theta$ as parameters of a compact model.

Notation for regression: $$ y = f(x,\theta) $$ or $$ y = f_{\theta}(x) $$

In regression we want to estimate both $y$ (after training) and $\theta$ (during training/fitting). Is the posterior one of these:

  • $p(\theta|y,x)$

  • $p(y|\theta,x)$

  • $p(y,\theta | x)$

  • $p(\theta|x)$

Or perhaps the idea of a posterior does not apply in typical regression?

Bull
  • 163

1 Answer


The posterior will be $p(\theta|y,x)$, the posterior over the weights. The reason is that you're not estimating $y$; you're estimating $\theta$. You can then use the posterior over the weights to *predict* a new $y$ given a new $x$, via the posterior predictive distribution $$p(y_{new}|y_{previous},x_{previous},x_{new})=\int p(y_{new}|x_{new},\theta)\,p(\theta|y_{previous},x_{previous})\,d\theta.$$
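As a concrete illustration (not from the original answer), here is a minimal sketch of this computation for conjugate Bayesian linear regression with a Gaussian prior and known noise variance, where both the posterior over $\theta$ and the posterior predictive have closed forms; the variable names and the simulated data are illustrative assumptions:

```python
import numpy as np

# A minimal sketch: Bayesian linear regression with known noise variance
# sigma2 and Gaussian prior theta ~ N(0, tau2 * I). Everything here is
# illustrative, not the only possible model choice.

rng = np.random.default_rng(0)

# Simulated training data: y = X @ theta_true + Gaussian noise
n, d = 50, 2
sigma2, tau2 = 0.25, 10.0                                  # noise var, prior var
X = np.column_stack([np.ones(n), rng.uniform(-3, 3, n)])   # intercept + slope
theta_true = np.array([1.0, -0.5])
y = X @ theta_true + rng.normal(0.0, np.sqrt(sigma2), n)

# Posterior over the weights, p(theta | y, X) = N(mu_n, Sigma_n)
Sigma_n = np.linalg.inv(X.T @ X / sigma2 + np.eye(d) / tau2)
mu_n = Sigma_n @ X.T @ y / sigma2

# Posterior predictive for a new input: integrating the Gaussian likelihood
# against the Gaussian posterior gives N(x_new @ mu_n,
# x_new @ Sigma_n @ x_new + sigma2).
x_new = np.array([1.0, 2.0])
pred_mean = x_new @ mu_n
pred_var = x_new @ Sigma_n @ x_new + sigma2

print("posterior mean of theta:", mu_n)
print("predictive mean, variance:", pred_mean, pred_var)
```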

It helps to write out explicitly what your likelihood and priors are. If you place a prior on a parameter, then you will have a posterior over that parameter. You wouldn't place a prior on $y$, so why would you have a posterior over $y$? Try looking at the Wikipedia page on Bayesian linear regression (https://en.wikipedia.org/wiki/Bayesian_linear_regression) or one of the previously asked questions on this website (Bayes regression: how is it done in comparison to standard regression?).
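For instance, one standard (though not the only) explicit choice of prior and likelihood in linear regression is the Gaussian one: $$\theta \sim \mathcal{N}(0,\ \tau^2 I), \qquad y_i \mid x_i, \theta \sim \mathcal{N}(x_i^\top\theta,\ \sigma^2),$$ so Bayes' rule gives $p(\theta|y,x) \propto p(\theta)\prod_i p(y_i|x_i,\theta)$. The prior sits on $\theta$, hence the posterior is over $\theta$, while $y$ enters only through the likelihood.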

aleshing
  • 1,588