
In linear regression, if we assume the error term follows a zero-mean Gaussian distribution and we minimise the mean squared error, then we can show that this minimisation yields parameters that estimate the conditional expectation of the target $y$ given $x$.
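As a rough numerical check of this claim, here is a minimal sketch using only the Python standard library. The true parameters (`beta0=1.0`, `beta1=2.0`), the noise level `sigma`, the uniform design for $x$, and the helper names `simulate` and `fit_simple_ols` are all assumptions made for illustration. It fits a one-dimensional least-squares line and compares the fitted value at a point `x0` with the empirical average of $y$ for samples whose $x$ lies near `x0`.

```python
import random


def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit of y = b0 + b1 * x (minimises the MSE)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def simulate(n=20000, beta0=1.0, beta1=2.0, sigma=0.5, seed=0):
    """Draw (x, y) pairs from y = beta0 + beta1 * x + e, with e ~ N(0, sigma^2).

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


xs, ys = simulate()
b0_hat, b1_hat = fit_simple_ols(xs, ys)

# Empirical estimate of E[y | x near x0]: average y over a narrow window.
x0 = 1.0
near = [y for x, y in zip(xs, ys) if abs(x - x0) < 0.05]
local_mean = sum(near) / len(near)
fitted = b0_hat + b1_hat * x0

print(f"estimated intercept and slope: {b0_hat:.3f}, {b1_hat:.3f}")
print(f"fitted value at x0: {fitted:.3f}, local average of y: {local_mean:.3f}")
```

Under these assumptions the fitted line and the local average should roughly agree, which is what one would expect if the least-squares fit is estimating $\mathbb{E}[y \mid x]$. The window width of 0.05 is an arbitrary choice that trades bias against noise.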

I have no trouble with this, as the estimation can be shown via maximum likelihood. My difficulty is in showing that the expectation of the ground truth $y$ conditioned on $x$ is indeed linear (in the parameters).

Is the following idea correct?

It can be shown that the conditional distribution of $y$ given $\mathbf{x}$ is also normally distributed:

$$ \begin{aligned} y \mid \mathbf{x} &\sim \mathcal{N}\left(f\left(\mathbf{x}; \boldsymbol{\theta}\right), \sigma^2\right) \\ &= \mathcal{N}\left(\beta_0 + \boldsymbol{\beta}^T \mathbf{x}, \sigma^2\right) \end{aligned} $$

The proof here shows this step.

If this is true, can I then say that $\mathbb{E}[y \mid \mathbf{x}]$ is the mean $\beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}$, by the definition of the Gaussian distribution?
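One way to make the linearity step explicit is to take conditional expectations of the model directly. This is a sketch that assumes the model is written as $y = \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x} + \varepsilon$ with $\mathbb{E}[\varepsilon \mid \mathbf{x}] = 0$, where the symbol $\varepsilon$ for the error term is introduced here for illustration:

$$ \begin{aligned} \mathbb{E}[y \mid \mathbf{x}] &= \mathbb{E}\left[\beta_0 + \boldsymbol{\beta}^{T}\mathbf{x} + \varepsilon \mid \mathbf{x}\right] \\ &= \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x} + \mathbb{E}[\varepsilon \mid \mathbf{x}] \\ &= \beta_0 + \boldsymbol{\beta}^{T}\mathbf{x} \end{aligned} $$

Under this reading, the linearity of the conditional mean comes from the assumed model form together with the zero conditional mean of the error, and the Gaussian assumption only adds the shape of the distribution around that mean.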

nan
  • The actual idea behind regression is to model the response variable conditioned on the explanatory variable, i.e. $Y \mid X = \beta_0 + \beta_1 X + e$. If you can find the distribution of $Y \mid X$, then $E(Y \mid X)$ is readily available. In the case of a Gaussian distribution, it is the mean $\mu$. – DevD Mar 15 '23 at 06:46
  • Even if you remove the assumption of Normality about the residual $e$, the expectation of $Y|X$ will still be a linear function of parameters given that $E(e)=0$ and $X$ is not random. – DevD Mar 15 '23 at 06:51
  • If $X$ is not random, as most texts assume, does that mean $X$ need not be drawn i.i.d.? I am having a hard time, since I see both formulations across different reference books. – nan Mar 15 '23 at 06:55
  • The i.i.d. assumption of linear regression is actually on the errors. There is no strict assumption stating that $Y$ or $X$ has to be i.i.d. We work with an i.i.d. sample, or rather a random sample, so that it is a true representation of the population. – DevD Mar 15 '23 at 07:19
  • Let me rephrase my previous comment. Observations should be independent, but no assumption states that $X$ has to be identically distributed. You can follow [https://stats.stackexchange.com/questions/220507/linear-regression-conditional-expectations-and-expected-values] to see why $X$ is not considered as a random variable in regression analysis. – DevD Mar 15 '23 at 07:27
