The proof for the first question requires only a few simple properties of the trace operator and the expected value, together with the definition of the covariance matrix (see here). First, as mentioned in the title, $||\beta||^2$ is a quadratic form, given by $||\beta||^2 = \beta^T \beta$. Writing $\mu = \mathbb{E}(\beta)$ for the mean and $\Sigma$ for the covariance matrix of $\beta$, we have:
$$ \begin{aligned}
\mathbb{E}\left( \beta^T \beta \right) &= \mathbb{E}\left( tr(\beta^T \beta) \right) \\
&= \mathbb{E} \left( tr(\beta \beta^T) \right) \\
&= tr \left(\mathbb{E} \left( \beta \beta^T \right) \right) \\
&= tr \left(\Sigma + \mu \mu^T\right) \\
&= tr \left(\Sigma \right) + tr \left(\mu \mu^T\right) \\
&= tr \left(\Sigma \right) + tr \left(\mu^T \mu \right) \\
&= tr \left(\Sigma \right) + \mu^T \mu = tr \left(\Sigma \right) + ||\mu||^2
\end{aligned}
$$
Here we just used a couple of basic probability and linear algebra properties:
- In line 1, because $\beta^T \beta$ is a scalar, $\beta^T \beta = tr(\beta^T \beta)$
- In line 2, we use the trace identity $tr(A^TB) = tr(AB^T)$, which holds for any two matrices $A$ and $B$ of the same shape, with $A = B = \beta$
- Because both the trace and the expected value are linear operators, they commute, so we can move the expected value operator inside the trace
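- From the definition of the covariance matrix, expanding the product and using $\mathbb{E}(\beta) = \mu$ gives $\Sigma = \mathbb{E}((\beta-\mu)(\beta-\mu)^T) = \mathbb{E} \left( \beta \beta^T \right) - \mu\mu^T$, and therefore $\mathbb{E} \left( \beta \beta^T \right) = \mathbb{E}((\beta-\mu)(\beta-\mu)^T) + \mu\mu^T = \Sigma + \mu\mu^T$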
- We use that the trace is linear, i.e. $tr(A+B) = tr(A) + tr(B)$.
- We again use the fact that $tr(A^TB) = tr(AB^T)$, this time with $A = B = \mu$
- Because $\mu^T \mu$ is a scalar, we can remove the trace operator (the same argument as in the first step), and $\mu^T \mu = ||\mu||^2$
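
As a quick sanity check, the result $\mathbb{E}(||\beta||^2) = tr(\Sigma) + ||\mu||^2$ can be verified numerically. The sketch below is only an illustration: the values of `mu` and `Sigma`, the sample size, and the choice of a Gaussian distribution are all assumptions made for the example (the derivation itself only needs the mean and covariance to exist, not normality).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean vector and covariance matrix, chosen only for illustration.
mu = np.array([1.0, -2.0, 0.5])
A = np.array([[1.0, 0.2, 0.0],
              [0.0, 0.8, 0.3],
              [0.0, 0.0, 0.5]])
Sigma = A @ A.T  # symmetric positive definite by construction

# Draw samples of beta; the Gaussian is an assumption made for convenience.
samples = rng.multivariate_normal(mu, Sigma, size=200_000)

# Monte Carlo estimate of E(||beta||^2).
empirical = np.mean(np.sum(samples**2, axis=1))

# Closed-form value from the derivation: tr(Sigma) + ||mu||^2.
theoretical = np.trace(Sigma) + mu @ mu

# The same identity holds exactly for the sample moments
# (biased sample covariance and sample mean).
xbar = samples.mean(axis=0)
S = np.cov(samples, rowvar=False, bias=True)
sample_identity = np.trace(S) + xbar @ xbar

print(f"Monte Carlo estimate:      {empirical:.4f}")
print(f"tr(Sigma) + ||mu||^2:      {theoretical:.4f}")
print(f"tr(S) + ||xbar||^2:        {sample_identity:.4f}")
```

The first two numbers should agree up to Monte Carlo error, while the first and third agree up to floating-point rounding, since the decomposition is an algebraic identity when applied to the empirical distribution.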