In linear regression, if we assume the error term follows a Gaussian distribution with zero mean and we minimise the mean squared error, then we can show that this minimisation leads to optimal parameters that estimate the conditional expectation of the target $y$ given $\mathbf{x}$.
I have no trouble with this part, since the equivalence can be shown via maximum likelihood; my trouble is in establishing that the expectation of the ground truth $y$ conditioned on $\mathbf{x}$ is indeed linear (in the parameters).
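Here is how I tried to fill that gap, assuming the standard additive-noise model $y = \beta_0 + \boldsymbol{\beta}^T\mathbf{x} + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ independent of $\mathbf{x}$:

$$ \mathbb{E}[y \mid \mathbf{x}] = \mathbb{E}\left[\beta_0 + \boldsymbol{\beta}^T\mathbf{x} + \varepsilon \mid \mathbf{x}\right] = \beta_0 + \boldsymbol{\beta}^T\mathbf{x} + \mathbb{E}[\varepsilon \mid \mathbf{x}] = \beta_0 + \boldsymbol{\beta}^T\mathbf{x}, $$

since $\mathbb{E}[\varepsilon \mid \mathbf{x}] = 0$ by assumption. So the linearity of $\mathbb{E}[y \mid \mathbf{x}]$ seems to come directly from assuming the additive model, not from the Gaussianity itself.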
Is the following idea correct?
It can be shown that the conditional distribution of $y$ given $\mathbf{x}$ is also normally distributed:
$$ \begin{aligned} y \mid \mathbf{x} &\sim \mathcal{N}\left(y \mid f\left(\mathbf{x}; \boldsymbol{\theta}\right), \sigma^2\right) \\ &= \mathcal{N}\left(y \mid \beta_0 + \boldsymbol{\beta}^T \mathbf{x}, \sigma^2\right) \end{aligned} $$

(I write the variance as $\sigma^2$ throughout, rather than as a precision $\beta^{-1}$, to avoid using $\boldsymbol{\beta}$ for both the precision and the coefficient vector.)
The proof here shows this step.
If this is true, can I then say that $\mathbb{E}[y \mid \mathbf{x}]$ is the mean $\beta_0 + \boldsymbol{\beta}^{T}\mathbf{x}$, by the definition of the Gaussian distribution?
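As a sanity check, here is a small simulation sketch (all names and parameter values are my own, chosen for illustration): I draw $y = \beta_0 + \boldsymbol{\beta}^T\mathbf{x} + \varepsilon$ with Gaussian zero-mean noise, fit by ordinary least squares (which minimises the MSE), and check that the fitted coefficients recover the true conditional-mean parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative true parameters of the conditional mean E[y | x] = b0 + b^T x
n, d = 100_000, 3
b0 = 1.5
b = np.array([2.0, -1.0, 0.5])

X = rng.normal(size=(n, d))
eps = rng.normal(scale=0.8, size=n)   # Gaussian noise, zero mean
y = b0 + X @ b + eps

# Least-squares fit with an intercept column; minimising the MSE is the
# Gaussian MLE under this model.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print(coef)  # close to [1.5, 2.0, -1.0, 0.5]
```

With a large sample the fitted `coef` is very close to `[b0, *b]`, which matches the claim that the MSE minimiser estimates $\mathbb{E}[y \mid \mathbf{x}]$.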