5

Given are 4 independent normal variables $x_i \sim \mathcal{N}(\mu_i,\sigma_i)_{i=1,\ldots,4}$, that are used to define the random variable $$Z\sim (x_1 - x_2 + x_3 + x_4)^2 - (-x_1 + x_2 + x_3 + x_4)^2.\tag{1}$$

Each squared term in eq.(1) is a linear combination of independent normal variables, whence each of those squares has a noncentral $\chi^2(k,M)$-distribution with $k=1$ degree of freedom and noncentrality parameter $M$. The two linear combinations are also orthogonal, $\begin{pmatrix} 1 & -1 & 1&1 \end{pmatrix} \cdot \begin{pmatrix} -1 & 1 & 1&1 \end{pmatrix}^T =0$ , whence uncorrelated; and because they are uncorrelated joint random variables, they are independent (see here).

Thus the squared linear combinations in eq.(1) can be written as a difference of $\chi^2$-variables $$Z\sim s \left( \chi^2\left(1,M_1\right)- \chi^2\left(1,M_2\right)\right)\tag{2}$$

with noncentrality parameters $M_1=\dfrac{1}{s}\left(\mu_1-\mu_2+\mu_3+\mu_4\right)^2, M_2=\dfrac{1}{s}\left(-\mu_1+\mu_2+\mu_3+\mu_4\right)^2$ and total variance $s=\sum_{i=1}^4\sigma^2_i$.

However this approach only works, if all $\sigma_i$ are identical. How to express eq.(1) by $\chi^2$-variables for different $\sigma_i$?

Graphical examples for PDF

All $\sigma_i$ are identical.

$\sigma=(5,5,5,5), \mu=(2,-11,-1,4)$, black: eq(1), red: eq(2) (both plots almost overlap)

All $\sigma_i$ are different.

$\sigma=(5,4,3,2), \mu=(2,-11,-1,4)$, black: eq(1), red: eq(2)

  • Short answer: $Z$ can be expressed as a quadratic form in four iid standard Normal variables. Diagonalize it, reducing the question to a weighted sum of squares of non-central chi-squared variates: you're done. – whuber Dec 26 '23 at 15:18
  • Even shorter. Let $Z_1=4(X_1-X_2)$ and $Z_2=X_3+X_4$. $Z_1$ and $Z_2$ are independent normals with respective means $4(\mu_1-\mu_2)$ and $\mu_3+\mu_4$ and respective variances $16(\sigma_1^2+\sigma_2^2)$ and $\sigma_3^2+\sigma_4^2$. Now $Z=(Z_1/4+Z_2)^2-(-Z_1+Z_2)^2=Z_1 Z_2$. Now you're just dealing with the product of two independent normals. – JimB Dec 26 '23 at 18:46
  • @JimB: Right -- but it's "even shorter" only for this special case. Generally, a quadratic form will be expressible as a linear combination of noncentral chi-squared variates, but not as a product of Normal variates. – whuber Dec 26 '23 at 20:37
  • @whuber I will amend my answer with that in mind. – JimB Dec 26 '23 at 20:46

3 Answers3

6

This problem is initially written as the difference in the squares of two correlated normals.

$$Z=(X_1-X_2+X_3+X_4)^2-(-X_1+X_2+X_3+X_4)^2$$

where all of the $X_i$ are independent random variables with potentially different means and variances: $X_i \sim N(\mu_i,\sigma_i^2)$.

The request is to re-write $Z$ in terms of the difference in two independent noncentral chisquare random variables times some constant:

$$Z=s(W_1-W_2)$$

where $W_i\sim \chi^2(1,m_i)$. To do so we do something similar to the idea presented in @whuber 's comment: The characteristic functions are found for both characterizations of $Z$ and then we equate all of the coefficients for the characteristic function argument $t$.

To have a less cluttered expression for

$$Z=(X_1-X_2+X_3+X_4)^2-(-X_1+X_2+X_3+X_4)^2$$

we first re-write $Z$ as follows:

$$Z=(Z_{12}+Z_{34})^2 - (-Z_{12}+Z_{34})^2$$

where $Z_{12}=X_1-X_2$ and $Z_{34}=X_3+X_4$. $Z_{12}$ and $Z_{34}$ are independent normals with respective means and variances $\mu_{12}=\mu_1-\mu_2$, $\mu_{34}=\mu_3+\mu_4$, $\sigma_{12}^2=\sigma_1^2+\sigma_2^2$, and $\sigma_{34}^2=\sigma_3^2+\sigma_4^2$.

Written this way the characteristic function of $Z$ is found with

Format[μ[i_]]:=Subscript[μ,i]
Format[σ[i_]]:=Subscript[σ,i]
dist = TransformedDistribution[(z[12] + z[34])^2 - (-z[12] + z[34])^2,
    {z[12] \[Distributed] NormalDistribution[μ[12], σ[12]],
     z[34] \[Distributed] NormalDistribution[μ[34], σ[34]]}];
cf = CharacteristicFunction[dist, t]

$$\frac{e^ {-\frac{4 t \left(-i \mu _{12} \mu _{34}+2 \mu _{34}^2 \sigma _{12}^2 t+2 \mu _{12}^2 \sigma _{34}^2 t\right)}{16 \sigma _{12}^2 \sigma _{34}^2 t^2+1}}}{\sqrt{16 \sigma _{12}^2 \sigma _{34}^2 t^2+1}}=\frac{e^{\frac{-t^2 \left(8 \mu _{34}^2 \sigma _{12}^2+8 \mu _{12}^2 \sigma _{34}^2\right)+i 4 \mu _{12} \mu _{34} t}{16 \sigma _{12}^2 \sigma _{34}^2 t^2+1}}}{\sqrt{16 \sigma _{12}^2 \sigma _{34}^2 t^2+1}}$$

The characteristic function for the alternate formulation is found with the following:

Format[m[i_]]:=Subscript[m,i]
dist2 = TransformedDistribution[s (w[1] - w[2]),
   {w[1] \[Distributed] NoncentralChiSquareDistribution[1, m[1]],
    w[2] \[Distributed] NoncentralChiSquareDistribution[1, m[2]]}];
cf2 = CharacteristicFunction[dist2, t];

$$\frac{e^{-s t \left(\frac{m_1}{2 s t+i}+\frac{m_2}{2 s t-i}\right)}}{\sqrt{4 s^2 t^2+1}}=\frac{e^{\frac{-2 \left(m_1+m_2\right) s^2 t^2+i \left(m_1-m_2\right) s t}{4 s^2 t^2+1}}}{\sqrt{4 s^2 t^2+1}}$$

We can find the matching terms of $t$ and $t^2$ in the exponents to determine $m_1$, $m_2$, and $s$. This means we need to simultaneously solve

$$16 \sigma^2_{12} \sigma^2_{34}=4s^2$$ $$8 \mu^2_{34}\sigma^2_{12}+8\mu^2_{12}\sigma^2_{34}=2s^2(m_1+m_2)$$ $$4\mu_{12}\mu_{34}=s(m_1-m_2)$$

sol = Solve[{16 σ[12]^2 σ[34]^2 == 4 s^2, 
  8 μ[34]^2 σ[12]^2 + 8 μ[12]^2 σ[34]^2 == 2 s^2 (m[1] + m[2]), 
  4 μ[12] μ[34] == s (m[1] - m[2])}, {s, m[1], m[2]}]

This results in two solutions:

$$s= \pm 2 \sigma _{12} \sigma _{34}$$ $$m_1= \frac{\left(\mu _{34} \sigma _{12}\pm\mu _{12} \sigma _{34}\right){}^2}{2 \sigma _{12}^2 \sigma _{34}^2}$$ $$m_2= \frac{\left(\mu _{34} \sigma _{12}\mp\mu _{12} \sigma _{34}\right){}^2}{2 \sigma _{12}^2 \sigma _{34}^2}$$

Finally, we can plug in the original parameters:

parms = {μ[12] -> μ[1] - μ[2], μ[34] -> μ[3] + μ[4], σ[12] -> Sqrt[σ[1]^2 + σ[2]^2],  σ[34] -> Sqrt[σ[3]^2 + σ[4]^2]}
sol = sol //. parms

$$s= \pm 2 \sqrt{\sigma _1^2+\sigma _2^2} \sqrt{\sigma _3^2+\sigma _4^2}$$ $$m_1= \frac{\left(\left(\mu _2-\mu _1\right) \sqrt{\sigma _3^2+\sigma _4^2}\pm\left(\mu _3+\mu _4\right) \sqrt{\sigma _1^2+\sigma _2^2}\right){}^2}{2 \left(\sigma _1^2+\sigma _2^2\right) \left(\sigma _3^2+\sigma _4^2\right)}$$ $$m_2= \frac{\left(\left(\mu _1-\mu _2\right) \sqrt{\sigma _3^2+\sigma _4^2}\mp \left(\mu _3+\mu _4\right) \sqrt{\sigma _1^2+\sigma _2^2}\right){}^2}{2 \left(\sigma _1^2+\sigma _2^2\right) \left(\sigma _3^2+\sigma _4^2\right)}$$

As a check consider one of the given examples.

parms = {μ[1] -> 2, μ[2] -> -11, μ[3] -> -1, μ[4] -> 4, σ[1] -> 5,
  σ[2] -> 4, σ[3] -> 3, σ[4] -> 2};
parms1 = {μ[12] -> μ[1] - μ[2], μ[34] -> μ[3] + μ[4], σ[12] -> Sqrt[σ[1]^2 + σ[2]^2],
  σ[34] -> Sqrt[σ[3]^2 + σ[4]^2]};
z1 = RandomVariate[dist /. parms1 /. parms, 1000000];

dist2 = TransformedDistribution[ s (w[1] - w[2]) /. sol[[2]] /. parms1 /. parms, {w[1] [Distributed] NoncentralChiSquareDistribution[1, m[1]], w[2] [Distributed] NoncentralChiSquareDistribution[1, m[2]]} /. sol[[2]] /. parms1 /. parms]; z2 = RandomVariate[dist2 /. parms, 1000000];

Histogram[z1, "FreedmanDiaconis", "PDF"]

Histogram of samples from original distribution

Histogram[z2, "FreedmanDiaconis", "PDF"]

Histogram of transformed distribution

We can also check on the corresponding sample means and variances:

{Mean[#], Variance[#]} & /@ {z1, z2}
(* {{156.019, 49649.8}, {156.27, 49587.}} *)

The pdf can also be easily found numerically using the characteristic function for specified values of the parameters.

Answer to a comment below:

It was asked to find the distribution and characteristic function of a more complicated random variable:

$$X=(z_5-z_3)(z_2-z_4)-(z_1-z_3)(z_6-z_4)$$

Rather than directly using that definition, it appears that Mathematica needs a little help.

First determine the multivariate normal distribution of the 4 differences in normal random variables.

dist = TransformedDistribution[{z[5] - z[3], z[2] - z[4], z[1] - z[3], z[6] - z[4]},
  {z[1] \[Distributed] NormalDistribution[μ[1], σ[1]], 
   z[2] \[Distributed] NormalDistribution[μ[2], σ[2]], 
   z[3] \[Distributed] NormalDistribution[μ[3], σ[3]], 
   z[4] \[Distributed] NormalDistribution[μ[4], σ[4]], 
   z[5] \[Distributed] NormalDistribution[μ[5], σ[5]], 
   z[6] \[Distributed] NormalDistribution[μ[6], σ[6]]}]

(* MultinormalDistribution[{-μ[3] + μ[5], μ[2] - μ[4], μ[1] - μ[3], -μ[4] + μ[6]}, {{σ[3]^2 + σ[5]^2, 0, σ[3]^2, 0}, {0, σ[2]^2 + σ[4]^2, 0, σ[4]^2}, {σ[3]^2, 0, σ[1]^2 + σ[3]^2, 0}, {0, σ[4]^2, 0, σ[4]^2 + σ[6]^2}}]*)

Now set up rules such that the parameters in the above distribution each have a single symbol. This seems to be necessary for Mathematica to complete in a reasonable amount of time.

parms1 = {-μ[3] + μ[5] -> μ1, μ[2] - μ[4] -> μ2, μ[1] - μ[3] -> μ3, -μ[4] + μ[6] -> μ4,
  σ[3]^2 + σ[5]^2 -> v1, σ[2]^2 + σ[4]^2 -> v2, σ[1]^2 + σ[3]^2 -> v3, 
  σ[4]^2 + σ[6]^2 -> v4, σ[3]^2 -> v5, σ[4]^2 -> v6};

(* Find distribution and characteristic function of the simplified difference of the products of normally distributed random variables *) dist2 = TransformedDistribution[x[1] x[2] - x[3] x[4], {x[1], x[2], x[3], x[4]} [Distributed] dist] /.parms1; AbsoluteTiming[cf = CharacteristicFunction[dist2, t]]

The result is messy and takes about 4 minutes to run. Now replace with the original parameters:

parms2 = parms1 /. {Rule[a_, b_] -> Rule[b, a]}
cf = cf /. parms2

That result is even messier so I won't show it here.

JimB
  • 3,734
  • 11
  • 20
  • Sorry for the confusion. An earlier version used $Z_1$ and $Z_2$ but in an attempt to make things clearer, I switched those to $Z_{12}$ and $Z_{34}$ but obviously didn't make all of the changes in the text. I'm in the process of simplifying my answer and should have that done later today. – JimB Jan 12 '24 at 17:29
  • What case is taking a long time? The characteristic functions cf and cf2 take 13 seconds and 0.03 seconds, respectively. – JimB Jan 16 '24 at 04:26
  • If one rewrites that random variable as$ x_1 x_2-x_3 x_4$ where $(x_1,x_2,x_3,x_4)$ has multiivariate normal distribution, then the characteristic functions takes about 6 minutes to complete. Sometimes Mathematica needs a little help. (In other words, $x_1=z_5-x_3$, $x_2=z_2-z_4$, etc.) – JimB Jan 18 '24 at 04:10
  • Sorry about that. Yes, $x_1=z_5-x_3$ should be $x_1=z_5-z_3$. – JimB Jan 18 '24 at 16:31
  • I've given you all of the information necessary. I hope it works out. – JimB Jan 18 '24 at 16:58
  • $(x_1,x_2,x_3,x_4)$ as stated has a multivariate normal distribution. I gave the definitions of $x_1$ and $x_2$ but figured you could see that $x_3=z_1-z_3$ and $x_4=z_6-z_4$. Mathematica can handle this as there are just 4 random variables rather than 6. – JimB Jan 18 '24 at 18:45
  • $(x_1,x_2,x_3,x_4)$ has a multivariate normal distribution with each element ($x_1$, $x_2$, $x_3$, and $x_4$) having a univariate normal distribution. – JimB Jan 19 '24 at 17:48
  • Don't know what happened there. It is now fixed. – JimB Jan 19 '24 at 21:25
  • I'm not doing well with my cutting-and-pasting. I left off the /.parms1 from the definition of dist2. With that fix I, again, get the result in 4 minutes on my older PC and 6 minutes on my laptop. I'll enter the edit a bit later today. You might want to add in that additional question in your original post or start a new post. – JimB Jan 20 '24 at 23:17
  • 1
    The assignments σ[3]^2 -> v5, σ[4]^2 -> v6 in parms1 seem to be either wrong, incomplete or superficial or I do not understand its meaning. – granular_bastard Jan 24 '24 at 23:30
  • Why do you say that? As you've seen I do make mistakes but I need something more than a feeling as to why those assignments are wrong. Those ($\sigma_3^2$ and $\sigma_4^2$ are terms in the off-diagonal of the covariance matrix. The whole point of the reassignment using parms1 was to make the structure as simple as possible for Mathematica. Possibly those weren't necessary but I wasn't going to try all combinations. Those re-assignments minimized the number of symbols and operations in the covariance matrix. – JimB Jan 24 '24 at 23:37
  • Sorry, you did find another error. I cut-and-pasted incorrectly again. σ[2]^2 + σ[4]^2 -> v3 should be σ[2]^2 + σ[4]^2 -> v2. – JimB Jan 24 '24 at 23:48
3

For this special case the random variable $Z$ can be simplified to the product of two independent normals:

Let $Z_1=4(X_1-X_2)$ and $Z_2=X_3+X_4$. $Z_1$ and $Z_2$ are independent normals with respective means $4(\mu_1-\mu_2)$ and $\mu_3+\mu_4$ and respective variances $16(\sigma_1^2+\sigma_2^2)$ and $\sigma_3^2+\sigma_4^2$. Now

$$Z=(Z_1/4+Z_2)^2-(-Z_1/4+Z_2)^2=Z_1 Z_2$$

Finding moments of $Z$ should be straightforwared. To plot the pdf of $Z$ using Mathematica (which your profile says you have access), one can execute the following:

(* Numerical estimate of the pdf of Z *)
pdf[z_?NumericQ, μ1_?NumericQ, σ1_?NumericQ, μ2_?NumericQ, σ2_?NumericQ] := 
 NIntegrate[PDF[NormalDistribution[μ1, σ1], x] PDF[NormalDistribution[μ2, σ2], z/x]/Abs[x],
 {x, -∞, ∞}]

(* Parameters from your example *) σz1 = Sqrt[16 (25 + 16)] σz2 = Sqrt[9 + 4] μz1 = 4 (2 - (-11)) μz2 = -1 + 4

(* Plot of density *) Plot[pdf[z, μz1, σz1, μz2, σz2], {z, -400, 1000}]

PDF of Z using OP's example parameters

Expressing Z as a linear combination of two independent noncentral chisquare rv's

Define the distribution of the linear combination.

$$Z=a_1\chi^2(1,m_1)+a_2\chi^2(1,m_2)$$

dist = TransformedDistribution[a1 y1 + a2 y2,
   {y1 \[Distributed] NoncentralChiSquareDistribution[1, m1], 
    y2 \[Distributed] NoncentralChiSquareDistribution[1, m2]}];

Define the distribution of Z expressed as the product of two independent normal rv's.

distz1z2 = TransformedDistribution[z1 z2,
  {z1 \[Distributed] NormalDistribution[μz1, σz1], 
   z2 \[Distributed] NormalDistribution[μz2, σz2]}];

If we can characterize $Z$ as a linear combination of two independent noncentral $\chi^2$ rv's which has 4 parameters, then the first 4 moments need to match with that of the characterization of $Z$ as the product of two independent normals.

sol = Solve[{Mean[distz1z2] == Mean[dist], 
     Variance[distz1z2] == Variance[dist],
     Moment[distz1z2, 3] == Moment[dist, 3], 
     Moment[distz1z2, 4] == Moment[dist, 4]},
    {a1, a2, m1, m2}][[1]] // FullSimplify

Solution to parameters of linear combination

From above we know that

parms2 = {σz1 -> Sqrt[16 (σ1^2 + σ2^2)], σz2 -> Sqrt[σ3^2 + σ4^2], 
  μz1 -> 4 (μ1 - μ2), μz2 -> μ3 + μ4};

So in terms of the original parameters the parameters of the linear combination of independent noncentral chisquare rv's are

sol /. parms2

Solution in terms of original parameters

As a minimal check, consider your example.

parms = {μ1 -> 2, μ2 -> -11, μ3 -> -1, μ4 -> 4, σ1 -> 5, σ2 -> 4, σ3 -> 3, σ4 -> 2};
zA = RandomVariate[distz1z2 /. sol /. parms2 /. parms, 1000000];
zB = RandomVariate[dist /. sol /. parms2 /. parms, 1000000];
Histogram[zA, "FreedmanDiaconis", "PDF", PlotRange -> {{-700, 1300}, {0, 0.003}}]

Histogram of density for product of two independent normals

Histogram[zB, "FreedmanDiaconis", "PDF", PlotRange -> {{-700, 1300}, {0, 0.003}}]

Histogram of density for linear combination of two noncentral chisquare random variables

JimB
  • 3,734
  • 11
  • 20
  • Can the answer solely be expressed by $\chi^2$ variables as desired by the OP? – granular_bastard Dec 27 '23 at 20:55
  • I know that's what you asked for but what I've presented is an approach to obtain the exact distribution. However, I'll take a look at if the distribution of $Z$ can be expressed as a linear combination of two independent noncentral $\chi^2$ variables. If so, that will just be equivalent to characterizing the distribution of $Z$ as the product of two independent normals. – JimB Dec 27 '23 at 23:11
  • How to solve the problem without Mathematica? For this case, accidentially, after 5 minutes of calculation, Mathematica returns a result. But we do not learn anything about the steps and therefore do not know the limit of the method. – granular_bastard Dec 28 '23 at 13:59
  • “accidentally” ? Ha, ha. You have the answer. Like is too short for me to spend time on such things. And get a faster computer. – JimB Dec 28 '23 at 14:51
  • How can Mathematica solve this if we instead of the product distribution of 2 normals use the original form (eq.1) as input for the code. – granular_bastard Jan 02 '24 at 21:15
  • Can you help me understand why that would be necessary? Why not $Z=(x_1-x_2 + x_3)^2-(-x_1+x_2+x_3)^2$? Without the rationale for doing so, it appears that you want to make things more complicated than necessary. Unless...there's a different problem that you want to solve that does require something more complicated. – JimB Jan 02 '24 at 21:50
  • It would be interesting to see how Mathematica can solve it for your difference of squared normals. – granular_bastard Jan 03 '24 at 19:07
  • After 30h on a modern CPU Mathematica is still calculating. – granular_bastard Jan 06 '24 at 15:07
  • If you could show the code that is taking so long, that would be helpful. – JimB Jan 08 '24 at 22:00
  • dist = TransformedDistribution[ a1 y1 + a2 y2, {y1 [Distributed] NoncentralChiSquareDistribution[1, m1], y2 [Distributed] NoncentralChiSquareDistribution[1, m2]}]; distz1z2 = TransformedDistribution[(z1 - z2 + z3)^2 - (-z1 + z2 + z3)^2, {z1 [Distributed] NormalDistribution[[Mu]z1, [Sigma]z1], z2 [Distributed] NormalDistribution[[Mu]z2, [Sigma]z2], z3 [Distributed] NormalDistribution[[Mu]z3, [Sigma]z3]}]; – granular_bastard Jan 09 '24 at 17:33
  • sol = Solve[{Mean[distz1z2] == Mean[dist], Variance[distz1z2] == Variance[dist], Moment[distz1z2, 3] == Moment[dist, 3], Moment[distz1z2, 4] == Moment[dist, 4]}, {a1, a2, m1, m2}][[1]] // FullSimplify – granular_bastard Jan 09 '24 at 17:33
  • Hmmm. I now wonder if it is because $a_1=-a_2$ and therefore trying to fit 4 parameters when only 3 are needed is the problem. Fitting a y1 - a y2 and using just the first 3 moments works almost instantly even if using (z1 - z2 + z3 + z4)^2 - (-z1 + z2 + z3 + z4)^2. – JimB Jan 09 '24 at 19:38
  • The case $a1=-a2$ is assumed in the OP however fails. Therefore it is not known beforehand that $a1=-a2$. – granular_bastard Jan 10 '24 at 00:38
  • That case doesn't fail for me. I do agree that it was not known beforehand that $a_1=-a_2$. But that happens sometimes when a model is unknowingly overparameterized. – JimB Jan 10 '24 at 21:23
  • As a check...if you plug in any 8 numbers for $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$, $\sigma_1$, $\sigma_2$, $\sigma_3$, and $\sigma_4$ (with the standard deviations all being positive), you'll find that you can get the exact same characteristic function for $s(\chi^2(1,m_1)-\chi^2(1,m_2)$ with only the 3 parameters $s$, $m_1$, and $m_2$. – JimB Jan 10 '24 at 23:02
2

This answer only explains why the approach fails but does not say how to get the expected result.

Set $$y_i\sim \mathcal{N}\left(\dfrac{\mu_i}{\sigma_i},1\right)$$ and write $$Z\sim(\sigma_1 y_1-\sigma_2 y_2+\sigma_3 y_3+\sigma_4 y_4)^2-(-\sigma_1 y_1+ \sigma_2 y_2+\sigma_3 y_3+\sigma_4 y_4)^2.$$ You see that the linear combinations are not orthogonal but dependent. They are independent only for the special case $\sigma^2_1+\sigma^2_2=\sigma^2_3+\sigma^2_4$.