Once again, here I am. Given the multiple linear regression model \begin{align*} \textbf{Y} = \textbf{X}\beta + \epsilon \end{align*}
where $\epsilon\sim\mathcal{N}(\textbf{0},\sigma^{2}\textbf{I})$ and $\mu = \textbf{X}\beta$, why do we need to determine the distribution of $\textbf{Y}$? If we apply the least-squares method to obtain $\hat{\beta}$, we get the explicit relation \begin{align*} Y_{i} = \hat{\beta}_{0} + \hat{\beta}_{1}x_{i1} + \ldots + \hat{\beta}_{p-1}x_{i,p-1} + e_{i} \end{align*}
from which we can obtain the value of the response variable $Y$ in terms of the explanatory variables (here $e_{i}$ denotes the $i$-th residual, as opposed to the unobservable error $\epsilon_{i}$). My second question is: how do we interpret each component of $\textbf{Y} = (Y_{1},Y_{2},\ldots,Y_{n})$? Does each $Y_{i}$ represent the outcome from a different sample? Or, if they all belong to the same sample, why do they have different means?
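To make the second question concrete, here is a minimal simulation sketch (all numbers hypothetical: $n=100$, $p=3$, and the chosen $\beta$ and $\sigma$ are arbitrary). It draws one sample of size $n$; every $Y_{i}$ is realized exactly once, yet each has its own mean $\mu_{i} = \mathbf{x}_{i}^{\top}\beta$ because the rows of $\textbf{X}$ differ.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

n, p = 100, 3                        # n observations, p coefficients (incl. intercept)
beta = np.array([1.0, 2.0, -0.5])    # hypothetical true coefficients
sigma = 1.0                          # hypothetical error standard deviation

# Design matrix: a column of ones plus p-1 explanatory variables
X = np.column_stack([np.ones(n), rng.uniform(0, 1, size=(n, p - 1))])

# One sample of size n: each Y_i is drawn once, but each has its own
# mean mu_i = x_i' beta because the covariate rows differ
mu = X @ beta
Y = mu + rng.normal(0.0, sigma, size=n)

# Least-squares estimate: beta_hat = (X'X)^{-1} X'Y
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # should be close to beta for large n
```

In this sketch the $Y_{i}$ all come from the same sample (one joint draw of the vector $\textbf{Y}$), and their means differ only through the covariates, which is exactly the sense in which one sample can carry $n$ different means.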
"Linear regression makes no assumptions on the distribution of the marginal outcome. However, there is an assumption on the distribution of the elements of Y." This is backwards - the variable Y can have any unspecified marginal distribution, but the conditional distribution given $\textbf{X}\beta$ (i.e., the distribution of the error term) must be Gaussian. You correctly represent it later in your post.
– Joey F. May 16 '19 at 15:32
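Assuming the model stated in the question, the distinction this comment draws can be written out explicitly: \begin{align*} \textbf{Y}\mid\textbf{X} \sim \mathcal{N}(\textbf{X}\beta,\ \sigma^{2}\textbf{I}), \end{align*} that is, the Gaussian assumption bears on the conditional distribution of $\textbf{Y}$ given $\textbf{X}$ (equivalently, on $\epsilon$), while the marginal distribution of $\textbf{Y}$, obtained by averaging over the distribution of the covariates, is left unspecified.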