I'm going through the proof of the Kalman filter equations in Shumway and Stoffer's *Time Series Analysis and Its Applications*. Could someone please tell me how equation (6.26) is justified? How can we say that the joint distribution is normal? A reference for the result would also be appreciated. For your convenience, here's the relevant chapter from the book. Thank you for your time.
https://www.stat.pitt.edu/stoffer/tsa4/Chapter6.pdf
Edit: On @jbowman's request, adding the math -
The state-equation in the basic Gaussian linear state-space model is given by
$ \mathbf{x}_t = \mathbf{\Phi}\mathbf{x}_{t-1}+\mathbf{w}_t, $ where $\mathbf{x}_t$ is a $p$-dimensional vector of reals and $\mathbf{w}_t$ is multivariate normal (MVN) with mean $\mathbf{0}$ and variance $\mathbf{Q}$.
The observation equation is given by $\mathbf{y}_t = \mathbf{A}_t\mathbf{x}_t + \mathbf{v}_t$, where $\mathbf{y}_t$ is a q-dimensional vector of reals, $\mathbf{A}_t$ is a $q\times p$ matrix, and $\mathbf{v}_t$ is MVN with mean zero and variance $\mathbf{R}$.
Suppose $\mathbf{x}_0$ is the initial state vector with mean $\mathbf{0}$ and variance $\mathbf{\Sigma}_0$. Further suppose $\mathbf{x}_0$, $\mathbf{w}_t$, and $\mathbf{v}_t$ are uncorrelated. Let $\mathbf{x}^s_t = E(\mathbf{x}_t\mid \mathbf{y}_{1:s})$ and let $P^s_t$ denote the variance of $\mathbf{x}_t\mid \mathbf{y}_{1:s}$.
The Kalman filter is given by
$\mathbf{x}^{t-1}_t = \mathbf{\Phi}\mathbf{x}^{t-1}_{t-1}$.
$P^{t-1}_t = \mathbf{\Phi}P^{t-1}_{t-1}\mathbf{\Phi}^{T} + \mathbf{Q}$.
$\mathbf{x}^t_t = \mathbf{x}^{t-1}_t + K_t(\mathbf{y}_t-\mathbf{A}_t\mathbf{x}^{t-1}_t)$.
$P^t_t = (I-K_t\mathbf{A}_t)P^{t-1}_t$,
where $K_t = P^{t-1}_t\mathbf{A}_t^{T}(\mathbf{A}_tP^{t-1}_t\mathbf{A}_t^{T}+\mathbf{R})^{-1}$.
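As a numerical sanity check (not from the book; a minimal sketch with hypothetical scalar parameters $\Phi$, $A$, $Q$, $R$, $\Sigma_0$), the filter recursions above can be compared against the brute-force conditional mean obtained by treating $(\mathbf{x}_T, \mathbf{y}_{1:T})$ as one joint Gaussian and applying the MVN conditioning formula directly:

```python
import numpy as np

# Hypothetical scalar example: p = q = 1, time-invariant A_t = A.
rng = np.random.default_rng(0)
T = 5
Phi, A, Q, R, Sigma0 = 0.8, 1.2, 0.5, 0.3, 1.0

# Marginal variances: Var(x_t) = Phi^2 Var(x_{t-1}) + Q, x_0 ~ N(0, Sigma0).
var_x = np.empty(T + 1)
var_x[0] = Sigma0
for t in range(1, T + 1):
    var_x[t] = Phi**2 * var_x[t - 1] + Q

# Cov(x_s, x_t) = Phi^{|t-s|} Var(x_{min(s,t)}), for s, t = 1..T.
Sxx = np.empty((T, T))
for s in range(1, T + 1):
    for t in range(1, T + 1):
        Sxx[s - 1, t - 1] = Phi ** abs(t - s) * var_x[min(s, t)]

# y_t = A x_t + v_t  =>  Cov(y) = A^2 Sxx + R I,  Cov(x, y) = A Sxx.
Syy = A**2 * Sxx + R * np.eye(T)
Sxy = A * Sxx

# Simulate one sample path from the model.
x = np.empty(T + 1)
x[0] = rng.normal(0.0, np.sqrt(Sigma0))
y = np.empty(T)
for t in range(1, T + 1):
    x[t] = Phi * x[t - 1] + rng.normal(0.0, np.sqrt(Q))
    y[t - 1] = A * x[t] + rng.normal(0.0, np.sqrt(R))

# Kalman filter recursions as stated above (scalar case).
m, P = 0.0, Sigma0
for t in range(1, T + 1):
    m_pred = Phi * m                       # x_t^{t-1}
    P_pred = Phi**2 * P + Q                # P_t^{t-1}
    K = P_pred * A / (A**2 * P_pred + R)   # Kalman gain K_t
    m = m_pred + K * (y[t - 1] - A * m_pred)
    P = (1 - K * A) * P_pred

# Brute-force E(x_T | y_1..y_T) and Var(x_T | y_1..y_T) from the joint Gaussian
# (all means are zero, so the conditional mean is Sxy Syy^{-1} y).
m_direct = Sxy[T - 1] @ np.linalg.solve(Syy, y)
P_direct = Sxx[T - 1, T - 1] - Sxy[T - 1] @ np.linalg.solve(Syy, Sxy[T - 1])

print(m, m_direct)  # should agree up to floating-point error
print(P, P_direct)
```

The two computations agree because, under joint normality, the filter's recursive update is exactly the MVN conditioning formula applied one observation at a time.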
The first two equations (the prediction step) follow directly by expanding the definitions of $\mathbf{x}^{t-1}_t$ and $P^{t-1}_t$, respectively.
Consider the regression of $\mathbf{y}_t$ on $\mathbf{y}_{1:(t-1)}$ and define the residual $\mathbf{\epsilon}_t = \mathbf{y}_t - E(\mathbf{y}_t\mid\mathbf{y}_{1:(t-1)}) = \mathbf{y}_t - \mathbf{A}_t\mathbf{x}^{t-1}_t$. It can be shown by a straightforward expansion of the definition that $Cov(\mathbf{x}_t, \mathbf{\epsilon}_t\mid \mathbf{y}_{1:(t-1)}) = P^{t-1}_t\mathbf{A}_t^{T}$.
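(For completeness, here is the expansion. Since $\mathbf{y}_t = \mathbf{A}_t\mathbf{x}_t + \mathbf{v}_t$, the residual can be written in terms of the one-step prediction error:
$$\mathbf{\epsilon}_t = \mathbf{A}_t(\mathbf{x}_t - \mathbf{x}^{t-1}_t) + \mathbf{v}_t.$$
Because $\mathbf{x}^{t-1}_t$ is a function of $\mathbf{y}_{1:(t-1)}$, it is constant under the conditioning, so
$$Cov(\mathbf{x}_t, \mathbf{\epsilon}_t\mid \mathbf{y}_{1:(t-1)}) = Cov\big(\mathbf{x}_t - \mathbf{x}^{t-1}_t,\ \mathbf{A}_t(\mathbf{x}_t - \mathbf{x}^{t-1}_t) + \mathbf{v}_t \mid \mathbf{y}_{1:(t-1)}\big) = P^{t-1}_t\mathbf{A}_t^{T},$$
using that $\mathbf{v}_t$ is uncorrelated with $\mathbf{x}_t$ and $\mathbf{y}_{1:(t-1)}$.)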
The proof in the book claims that the conditional distribution of $(\mathbf{x}_t, \mathbf{\epsilon}_t)$, given $\mathbf{y}_{1:(t-1)}$, is MVN; the update equation for $\mathbf{x}^t_t$ is then obtained by conditioning $\mathbf{x}_t$ on $(\mathbf{\epsilon}_t, \mathbf{y}_{1:(t-1)})$, using standard results for the MVN (see: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions).
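Concretely, the standard MVN result being invoked is: if
$$\begin{pmatrix}\mathbf{z}_1\\ \mathbf{z}_2\end{pmatrix} \sim N\left(\begin{pmatrix}\mathbf{\mu}_1\\ \mathbf{\mu}_2\end{pmatrix}, \begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22}\end{pmatrix}\right),$$
then
$$\mathbf{z}_1 \mid \mathbf{z}_2 \sim N\big(\mathbf{\mu}_1 + \Sigma_{12}\Sigma_{22}^{-1}(\mathbf{z}_2 - \mathbf{\mu}_2),\ \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\big).$$
Applying this conditionally on $\mathbf{y}_{1:(t-1)}$ with $\mathbf{z}_1 = \mathbf{x}_t$, $\mathbf{z}_2 = \mathbf{\epsilon}_t$, $\Sigma_{12} = P^{t-1}_t\mathbf{A}_t^{T}$, and $\Sigma_{22} = \mathbf{A}_tP^{t-1}_t\mathbf{A}_t^{T} + \mathbf{R}$ (noting $E(\mathbf{\epsilon}_t\mid\mathbf{y}_{1:(t-1)}) = \mathbf{0}$) yields exactly $\mathbf{x}^t_t = \mathbf{x}^{t-1}_t + K_t\mathbf{\epsilon}_t$ with $K_t$ as above; but this step presupposes the joint normality that I am asking about.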
The question here is: why is the conditional distribution of $(\mathbf{x}_t, \mathbf{\epsilon}_t)$, given $\mathbf{y}_{1:(t-1)}$, MVN? How can we show this?