I'm trying to informally derive the chi-squared test statistic using basic geometry and algebra. I can obtain a system of equations that contains Karl Pearson's chi-squared test statistic, but I need help showing from those equations that the test statistic equals chi^2.
My approach:
I have a 3-sided die.
I roll the die a number of times and record the frequency of each face.
This system has 2 degrees of freedom (we only need to know the frequencies of any 2 faces to infer the frequency of the remaining one).
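To make the setup concrete, here is a minimal Python sketch of the experiment. The helper name `roll_die`, the 60 rolls, the equal face probabilities and the fixed seed are all illustrative assumptions of mine, not part of the derivation.

```python
import random
from collections import Counter

def roll_die(n_rolls, probs, seed=0):
    """Roll a 3-sided die n_rolls times and return the count of each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    faces = ["A", "B", "C"]
    rolls = rng.choices(faces, weights=probs, k=n_rolls)
    counts = Counter(rolls)
    return {face: counts[face] for face in faces}

n = 60  # assumed number of rolls
observed = roll_die(n, probs=[1 / 3, 1 / 3, 1 / 3])

# Knowing any two counts fixes the third, hence 2 degrees of freedom.
implied_c = n - observed["A"] - observed["B"]
print(observed, implied_c == observed["C"])
```

The last line only confirms the bookkeeping: the count for face C is always n minus the other two counts.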
Therefore, by the Pythagorean theorem, we can describe the distance, chi, between the observed and expected values with a formula in 2 dimensions: $$ \chi^2 = Z^2 + Z_{prime}^2 $$ ...where Z is the standardized difference between the observed and expected values for any one face, and Z_prime is the remaining side of our right triangle in 2D space. (Z_prime also implies transforming the distribution of the 2nd face from a joint distribution into an independent one, which makes the combined distribution circular.)
Note that: $$ Z^2 = p\,Z^2 + (1-p)\,Z^2 $$
...and similarly: $$ Z_{prime}^2 = p\,Z_{prime}^2 + (1-p)\,Z_{prime}^2 $$
...therefore: $$ \chi^2 = p\,Z^2 + (1-p)\,Z^2 + p\,Z_{prime}^2 + (1-p)\,Z_{prime}^2 $$
So for all 3 faces (A, B, C) we have the following system of equations: $$ \chi^2 = p_{A}\,Z_{A}^2 + (1-p_{A})\,Z_{A}^2 + p_{A}\,Z_{A.prime}^2 + (1-p_{A})\,Z_{A.prime}^2 \\ \chi^2 = p_{B}\,Z_{B}^2 + (1-p_{B})\,Z_{B}^2 + p_{B}\,Z_{B.prime}^2 + (1-p_{B})\,Z_{B.prime}^2 \\ \chi^2 = p_{C}\,Z_{C}^2 + (1-p_{C})\,Z_{C}^2 + p_{C}\,Z_{C.prime}^2 + (1-p_{C})\,Z_{C.prime}^2 $$
[Equations 1-3]
Now, since: $$ Z = \frac{(O-E)}{\sigma} $$
...and, since the count for a single face is binomial with variance np(1-p): $$ Z^2 = \frac{(O-E)^2}{np(1-p)} $$ ...then, if we multiply Z^2 by (1-p) and recall that E = np, we get: $$ (1-p)\,\frac{(O-E)^2}{np(1-p)} = \frac{(O-E)^2}{np} = \frac{(O-E)^2}{E} $$
Therefore, the sum of the 2nd column from Equations 1-3 is identical to Pearson's chi-squared test statistic for 2 degrees of freedom, i.e.: $$ (1-p_{A})\,Z_{A}^2 + (1-p_{B})\,Z_{B}^2 + (1-p_{C})\,Z_{C}^2 \\ = \frac{(O_{A}-E_{A})^2}{E_{A}} + \frac{(O_{B}-E_{B})^2}{E_{B}} + \frac{(O_{C}-E_{C})^2}{E_{C}} $$
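As a quick numerical sanity check of the identity above, here is a minimal Python sketch. The counts and probabilities are made-up illustration values, and the helper names `z_squared` and `pearson_statistic` are my own, not from any library.

```python
import math

def z_squared(o, n, p):
    """Squared standardized difference for one face: (O - E)^2 / (n p (1 - p))."""
    e = n * p
    return (o - e) ** 2 / (n * p * (1 - p))

def pearson_statistic(observed, probs):
    """Pearson's statistic: sum over faces of (O - E)^2 / E, with E = n p."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))

# Hypothetical data: 60 rolls of a die with unequal face probabilities.
observed = [25, 20, 15]
probs = [0.5, 0.3, 0.2]
n = sum(observed)

# Sum of the (1 - p) * Z^2 terms, i.e. the 2nd column of Equations 1-3.
second_column = sum((1 - p) * z_squared(o, n, p) for o, p in zip(observed, probs))
pearson = pearson_statistic(observed, probs)

print(second_column, pearson)
print(math.isclose(second_column, pearson))
```

The two printed numbers should agree up to floating-point rounding, whatever counts and probabilities are plugged in, since the (1-p) factor cancels algebraically.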
My question is: how can I demonstrate from Equations 1-3 that: $$ \chi^2 = (1-p_{A})\,Z_{A}^2 + (1-p_{B})\,Z_{B}^2 + (1-p_{C})\,Z_{C}^2 $$ [Equation 4]
By the way, it's already straightforward to demonstrate (by multiplying each of Equations 1-3 by its own p and adding the results) that: $$ \chi^2 = \\ p_{A}\,Z_{A}^2 + p_{A}\,Z_{A.prime}^2 + \\ p_{B}\,Z_{B}^2 + p_{B}\,Z_{B.prime}^2 + \\ p_{C}\,Z_{C}^2 + p_{C}\,Z_{C.prime}^2 $$ ...since the probabilities sum to 1.
Similarly, multiplying each equation by its own (1-p) and adding, we can show that: $$ 2\chi^2 = \\ (1-p_{A})\,Z_{A}^2 + (1-p_{A})\,Z_{A.prime}^2 + \\ (1-p_{B})\,Z_{B}^2 + (1-p_{B})\,Z_{B.prime}^2 + \\ (1-p_{C})\,Z_{C}^2 + (1-p_{C})\,Z_{C.prime}^2 $$ ...since the three (1-p) terms sum to 3 - 1 = 2.
But I'm not sure that these identities help me arrive at Equation 4.
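For what it's worth, the two weighted identities above can be checked numerically with a small Python sketch. It treats chi^2 as a free trial number and sets Z_prime^2 = chi^2 - Z^2 for each face, which is all that Equations 1-3 say. The probabilities, the Z^2 values and the function name `check_identities` are arbitrary assumptions for illustration, not data from a real experiment.

```python
import math

probs = [0.5, 0.3, 0.2]  # hypothetical face probabilities, summing to 1
z_sq = [0.4, 1.1, 0.7]   # hypothetical values of Z_A^2, Z_B^2, Z_C^2

def check_identities(chi2):
    """Return whether the p-weighted and (1 - p)-weighted identities hold."""
    # Equations 1-3 say Z^2 + Z_prime^2 = chi^2 for every face.
    z_prime_sq = [chi2 - z for z in z_sq]
    terms = list(zip(probs, z_sq, z_prime_sq))
    weighted_p = sum(p * (z + zp) for p, z, zp in terms)
    weighted_q = sum((1 - p) * (z + zp) for p, z, zp in terms)
    return math.isclose(weighted_p, chi2), math.isclose(weighted_q, 2 * chi2)

# Trial values chosen large enough that every Z_prime^2 stays non-negative.
for chi2 in (1.5, 2.0, 7.3):
    print(chi2, check_identities(chi2))
```

In this sketch both checks pass for every trial value of chi^2, which seems consistent with my doubt that these two identities on their own lead to Equation 4.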