4

Let $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)$ be a random vector following a multivariate normal density distribution. I am interested in the density function of the transformed variable

$$z = \mathbf{x}^\top \mathbb{A} \mathbf{x} + \mathbf{b}^\top \mathbf{x}$$

where $\mathbb{A}$ is a constant matrix and $\mathbf{b}$ a constant vector. Is there a name for the distribution associated with $z$?

I presume this related to the Chi-squared distribution, which is a special case of this problem. But I'm not sure my problem can be reduced to the Chi-squared distribution?

Any further ideas?

Update: Upon further thought and thanks to @whuber comment, I see the distribution of $z$ might not have a closed form. A less ambitious goal is then to compute the moments of $z$, starting with the mean, variance, ....

The mean is actually easy:

$$\langle z\rangle = \mathrm{Tr}(\mathbb{A}\Sigma) + \boldsymbol{\mu}^{\top}\mathbb{A}\boldsymbol{\mu} + \mathbf{b}^{\top}\mathbf{u}$$

a06e
  • 4,410
  • 1
  • 22
  • 50
  • 1
    Note that even $x'Ax$ does not have to be $\chi^2$ distributed. Check the famous Cochran theorem. – Zhanxiong Dec 12 '22 at 23:26
  • 4
    It is a sum of a scaled non-central Chi-squared distribution and scaled versions of Chi-squared distributions. It has no simple form unless $\mathbb A$ is a multiple of the identity. When $b=0,$ all the distributions involved are Chi-squared -- equivalently, they are scaled $\Gamma(1/2)$ distributions -- and the thread at https://stats.stackexchange.com/questions/72479 shows what their sum looks like. – whuber Dec 12 '22 at 23:26
  • @whuber Thanks. I realize the full distribution has no closed-form. So I decrease my ambitions. I would like now to compute the first moments of $z$ (mean, variance, maybe higher). – a06e Dec 13 '22 at 11:33
  • 2
    You can find answers in recent threads initiated by Yaroslav Bulatov. Alternatively, upon diagonalizing $\mathbb A + \mathbb A^\prime,$ recognize that $z$ is a constant plus a linear combination of independent non-central Chi-squared$(1)$ distributions and apply the usual rules for computing with covariances. – whuber Dec 13 '22 at 14:29
  • 1
    The mean you presented should be (for general $A$ not necessarily symmetric) $E[z] = \frac{1}{2}\operatorname{Tr}((A + A^\top)\Sigma) + \frac{1}{2}\mu^\top(A + A^\top)\mu + b^\top\mu$. – Zhanxiong Dec 13 '22 at 16:00
  • Thanks! @Zhanxiong I think you're missing some 1/2 factors ? – a06e Dec 13 '22 at 20:40
  • @becko What do you mean? I had $1/2$ in the mean expression. – Zhanxiong Dec 13 '22 at 20:44
  • I think I should get your expression if I replace $A$ in mine, with $(A + A^T) / 2$, don't you agree? If I do that, I get extra 1/2 factors (so 1/4). – a06e Dec 13 '22 at 20:47
  • @becko Are you talking about the mean or variance? – Zhanxiong Dec 13 '22 at 20:52
  • The mean. I have done some numerical checks which seem to agree with my expression. – a06e Dec 13 '22 at 20:58
  • @becko No. You expression doesn't agree with the well known result when $A$ is symmetric: $E[z] = \operatorname{tr}(A\Sigma) + \mu'A\mu + b'\mu$. The "$1/2$" in your expression is redundant if you do not include the $A'$. – Zhanxiong Dec 13 '22 at 21:07

1 Answers1

3

$\DeclareMathOperator{\Var}{Var}$ $\DeclareMathOperator{\Cov}{Cov}$ $\DeclareMathOperator{\tr}{tr}$

The variance of $z$ can be computed as follows (suppose $A$ is symmetric, which is standard when the quadratic form is discussed. For the non-symmetric case, rewrite $z$ as $z = x'Bx + b'x$, where $B = (A + A')/2$ and replace $A$ below with the symmetric matrix $B$).

In this thread, it is shown that when $x \sim N(\mu, \Sigma)$ and $A$ is symmetric, \begin{align} \Var(x'Ax) = 2\tr((A\Sigma)^2) + 4\mu'A\Sigma A\mu. \end{align} Therefore, to evaluate \begin{align} \Var(z) = \Var(x'Ax + b'x) = \Var(x'Ax) + \Var(b'x) + 2\Cov(x'Ax, b'x), \tag{1} \end{align} it suffices to evaluate \begin{align} \Cov(x'Ax, b'x) = E[x'Axb'x] - E[x'Ax]E[b'x]. \end{align}

Since $E[x'Ax] = \mu'A\mu + \tr(A\Sigma), E[b'x] = b'\mu$, the only difficult part left is $E[x'Axb'x]$. To this end, write $x = \mu + y$, where $y \sim N(0, \Sigma)$, then \begin{align} & x'Axb'x = (\mu + y)'A(\mu + y)b'(\mu + y) \\ =& (\mu'A\mu + 2\mu'Ay + y'Ay)(b'\mu + b'y) \\ =& \mu'A\mu b'\mu + \mu'A\mu b'y + 2b'\mu\mu'Ay + 2\mu'Ayb'y + b'\mu y'Ay + y'Ayb'y \end{align}

In my answer to this thread, it is argued that for $y \sim N(0, \Sigma)$, we have \begin{align} E[y'Ayb'y] = 0. \end{align} In addition, \begin{align} & E[\mu'Ayb'y] = E[y'b\mu'Ay] = \tr(b\mu'A\Sigma) = \mu'A\Sigma b, \\ & E[b'\mu y'Ay] = b'\mu\tr(A\Sigma). \end{align}

It thus follows that \begin{align} & E[x'Axb'x] = \mu'A\mu b'\mu + 2\mu'A\Sigma b + b'\mu\tr(A\Sigma), \\ & \Cov(x'Ax, b'x) = \mu'A\mu b'\mu + 2\mu'A\Sigma b + b'\mu\tr(A\Sigma) - (\mu'A\mu + \tr(A\Sigma))b'\mu \\ &\phantom{\Cov(x'Ax, b'x)} = 2\mu'A\Sigma b. \end{align}

Substituting \begin{align} & \Var(x'Ax) = 2\tr((A\Sigma)^2) + 4\mu'A\Sigma A\mu, \\ & \Var(b'x) = b'\Sigma b, \\ & \Cov(x'Ax, b'x) = 2\mu'A\Sigma b \end{align} into $(1)$, we get \begin{align} \Var(z) = 2\tr((A\Sigma)^2) + 4\mu'A\Sigma A\mu + b'\Sigma b + 4\mu'A\Sigma b. \end{align}

Zhanxiong
  • 18,524
  • 1
  • 40
  • 73
  • I get a similar expression, but off by some of the factors, and also the second-to last term I have is different.. I will do some numerical verifications to check – a06e Dec 13 '22 at 20:43
  • @becko As I wrote at the beginning of this answer, if $A$ is non-symmetric, then you need to replace $A$ in the last expression with $(A + A')/2$ -- that may be what you meant by "off some of the factors". – Zhanxiong Dec 13 '22 at 20:49
  • This is the expression I get for the variance: $\frac{1}{2}tr(\mathbb{A\Sigma A\Sigma}) + \mathbf{u}^{\top}\mathbb{A\Sigma A}\mathbf{u} + 2\mathbf{b}^{\top}\mathbb{\Sigma A}\mathbf{u} + \mathbf{b}^{\top}\Sigma\mathbf{b}$, assuming $A$ is symmetric. – a06e Dec 13 '22 at 20:59
  • 1
    @becko This is not right. To see that, just set $b = 0$ first and compare your expression with this answer. – Zhanxiong Dec 13 '22 at 21:03
  • Ah sorry, it's my fault. I was working with the expression $ z = \frac{1}{2}\mathbf{x}^{\top}\mathbb{A}\mathbf{x} + \mathbf{b}^{\top}\mathbf{x} + c$ – a06e Dec 13 '22 at 21:16
  • So now I agree with your expression. – a06e Dec 13 '22 at 21:16
  • I still don't agree with the variance though ... if I take my expression in the previous comment, and multiply all the $A$ by 2, you see I never get the $6\mu A\mu b\mu$ term you get. – a06e Dec 13 '22 at 21:19