
I believe I have a misconception about the $\chi^2$ and/or standard normal distribution. Hence, I would like you to help me understand what it means that the $\chi_k^2$ distribution is a sum of $k$ independent, squared, standard normal distributions.

In fact, according to many online sources, such as Wikipedia, the $\chi^2$ distribution is

$$ \chi_k^2 = \sum_{i=1}^k Z_i^2 $$

where $Z_i$ are the $k$ different standard normal distributions. Furthermore, according to such sources, in some ways this sum becomes

$$ f(x;k) = \begin{cases} \dfrac{ x^{ \frac{k}{2} - 1 }e^{ -\frac{x}{2} } }{ 2^{\frac{k}{2}}\Gamma\left(\frac{k}{2}\right) } & \textrm{if } x>0, \\ 0 & \textrm{otherwise.} \end{cases} $$

My problem is in what follows:

As far as I know, the normal distribution is

$$ Z = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$

and it gets "normalized" with a transformation that makes $\mu = 0$ and $\sigma = 1$, which is

$$ Z(0,1) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}} $$

Now, the STANDARD normal distribution has no parameters at all, which means that in the $\chi_k^2$ distribution the sum could simply be replaced by a multiplication

$$ \chi_k^2 = kZ^2 $$

which has to be wrong, otherwise there would not be such a complicated definition of the $\chi^2$ distribution.

What is wrong in all of this? Please let me know.

  • There is a difference in distribution between adding $n$ i.i.d. random variables and multiplying one of them by $n$. For example, the former has variance $n\sigma^2$ and the latter $n^2\sigma^2$. The shape can change too (unless they are "stable" - e.g. normal) – Henry Aug 19 '22 at 08:06
  • "in some ways this sum becomes..." It happens in a legitimate way. You can resort to the uniqueness of the MGF to show that the sum is a gamma variate with the corresponding parameters. – User1865345 Aug 19 '22 at 08:14
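
The MGF argument mentioned in the last comment can be sketched as follows. This is only an outline, assuming $Z, Z_1, \dots, Z_k$ are independent standard normal random variables and $t < \frac{1}{2}$. For a single squared standard normal,

$$ M_{Z^2}(t) = E\left[e^{tZ^2}\right] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{(1-2t)x^2}{2}}\,dx = (1-2t)^{-\frac{1}{2}} $$

By independence, the MGF of the sum is the product of the individual MGFs:

$$ M_{\sum_{i=1}^k Z_i^2}(t) = \prod_{i=1}^k (1-2t)^{-\frac{1}{2}} = (1-2t)^{-\frac{k}{2}} $$

This is the MGF of a gamma distribution with shape $\frac{k}{2}$ and scale $2$, whose density is the $f(x;k)$ quoted in the question. By contrast, $M_{kZ^2}(t) = (1-2kt)^{-\frac{1}{2}}$ for $t < \frac{1}{2k}$, which corresponds to a gamma distribution with shape $\frac{1}{2}$ and scale $2k$, so the two distributions differ whenever $k > 1$.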

1 Answer


Sampling one value from $$ \sum_{i=1}^k Z_i^2 $$ requires making one draw from $Z_1$, one draw from $Z_2$, and so forth. In other words, you must make $k$ independent draws from the $N(0, 1)$ distribution.

On the other hand, sampling one value from $$ kZ^2 $$ requires making a single draw from $Z$, squaring it, and multiplying it by $k$.
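
A quick way to see that the two constructions cannot share a distribution is to compare their first two moments. The following is a short worked calculation, assuming only the standard facts $E[Z^2] = 1$ and $E[Z^4] = 3$ for $Z \sim N(0,1)$, so that $\mathrm{Var}(Z^2) = 3 - 1 = 2$:

$$ E\left[\sum_{i=1}^k Z_i^2\right] = k, \qquad \mathrm{Var}\left(\sum_{i=1}^k Z_i^2\right) = \sum_{i=1}^k \mathrm{Var}\left(Z_i^2\right) = 2k $$

$$ E\left[kZ^2\right] = k, \qquad \mathrm{Var}\left(kZ^2\right) = k^2\,\mathrm{Var}\left(Z^2\right) = 2k^2 $$

The means agree, but for $k = 3$ the variances are $6$ and $18$, so $kZ^2$ is far more spread out than the sum of independent squares.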


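# Sum of three independent squared standard normal draws
sample1 <- rnorm(n = 1e4)^2 + rnorm(n = 1e4)^2 + rnorm(n = 1e4)^2
# A single squared standard normal draw, multiplied by 3
sample2 <- 3 * rnorm(n = 1e4)^2

# Chi-squared density with 3 degrees of freedom, overlaid with both kernel density estimates
curve(dchisq(x, df = 3), from = 0, to = 40, col = "red", lwd = 2)
lines(density(sample1), col = "blue")
lines(density(sample2), col = "green")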

[Figure: $\chi_3^2$ density (red) with kernel density estimates of sample1 (blue) and sample2 (green)]


from seaborn import displot
import numpy.random as dists
import pandas as pd

sample_size = 10**4
# Sum of three independent squared standard normal draws
sample1 = dists.normal(size = sample_size)**2 + dists.normal(size = sample_size)**2 + dists.normal(size = sample_size)**2
# A single squared standard normal draw, multiplied by 3
sample2 = 3 * dists.normal(size = sample_size)**2
# Direct draws from the chi-squared distribution with 3 degrees of freedom
sample3 = dists.chisquare(df = 3, size = sample_size)

plot_data = pd.concat([
    pd.DataFrame({'label': '3 independent chi_sq_1', 'data': sample1}),
    pd.DataFrame({'label': '3 times chi_sq_1', 'data': sample2}),
    pd.DataFrame({'label': 'chi_sq_3', 'data': sample3}),
], ignore_index = True)

# Histogram, then empirical CDF, of the three samples
displot(data = plot_data, x = 'data', hue = 'label')
displot(data = plot_data, x = 'data', hue = 'label', kind = 'ecdf')

[Figures: histogram and empirical CDF of the three samples, grouped by label]
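
For a numerical check that does not rely on reading plots, one can compare sample means and variances. The sketch below is only illustrative: the function name `compare_sum_vs_scaled`, the sample size, and the seed are arbitrary choices, not part of the original answer. If the sum of independent squares is $\chi_3^2$, its variance should be close to that of the direct chi-squared draws, while $3Z^2$ should have a similar mean but a clearly larger variance.

```python
import numpy as np


def compare_sum_vs_scaled(k=3, n=100_000, seed=0):
    """Return (mean, variance) for three samples of size n.

    "sum":       sum of k independent squared standard normal draws
    "scaled":    k times one squared standard normal draw
    "chisquare": direct draws from the chi-squared distribution with k df
    """
    rng = np.random.default_rng(seed)
    # Each row holds k independent N(0, 1) draws; square them and sum across the row.
    sum_of_squares = (rng.standard_normal(size=(n, k)) ** 2).sum(axis=1)
    # One N(0, 1) draw per sample, squared and multiplied by k.
    scaled_square = k * rng.standard_normal(size=n) ** 2
    # Reference sample drawn directly from the chi-squared distribution.
    reference = rng.chisquare(df=k, size=n)
    samples = {
        "sum": sum_of_squares,
        "scaled": scaled_square,
        "chisquare": reference,
    }
    return {name: (x.mean(), x.var()) for name, x in samples.items()}


stats = compare_sum_vs_scaled()
for name, (mean, var) in stats.items():
    print(f"{name:>10}: mean = {mean:.2f}, variance = {var:.2f}")
```

With $k = 3$, the "sum" and "chisquare" rows should both show a variance near $2k = 6$, whereas the "scaled" row should show a variance near $2k^2 = 18$, up to sampling noise.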

Cong Chen
  • 223
ocram
  • 21,851