
I have a 3-sided (fair) die with sides x1, x2, x3. I roll the die 3000 times and tally the results: x1 * 990, x2 * 1050, x3 * 960. A quick chi-square test gives a p-value of ~12%:

chisq.test(x = c(990,1050,960), p = c(1,1,1), rescale.p = TRUE)$p.value
# 0.122456428252982
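(For reference, the statistic behind that p-value can be computed by hand; a quick sketch of what `chisq.test` does under the hood:)

```r
obs <- c(990, 1050, 960)
expd <- rep(3000 / 3, 3)                    # 1000 expected per side under fairness
stat <- sum((obs - expd)^2 / expd)          # (100 + 2500 + 1600) / 1000 = 4.2
pchisq(stat, df = 2, lower.tail = FALSE)    # ~0.1225, matching the p-value above
```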

Now, I want to reproduce this p-value via simulation. Here are my steps:

  1. Create the simulation:
library(tidyverse)
set.seed(123)

x <- replicate(10000, sample(1:3, size = 3000, replace = TRUE))
df <- data.frame(x1 = colSums(x == 1, na.rm = TRUE),
                 x2 = colSums(x == 2, na.rm = TRUE),
                 x3 = colSums(x == 3, na.rm = TRUE))
df %>% head(3)

#     x1   x2   x3
# 1  979  978 1043
# 2  974 1042  984
# 3 1017  966 1017

  2. Find the probability that the tally for x1 is as extreme as or more extreme than 990:
p_990 = df %>% summarise(count=sum(abs(1000-x1)>=10) / 10000) %>% pull()
p_990

0.711

  3. Find the probability that the tally for x2 is as extreme as or more extreme than 1050:
p_1050 = df %>% summarise(count=sum(abs(1000-x2)>=50)/ 10000) %>% pull()
p_1050

0.0596

  4. Take the product of the two probabilities:
p_990 * p_1050

0.0423756

This is not the ~12% I was expecting to obtain - where am I going wrong?

Rez99
  • 233
  • Your simulation isn't doing a chi-squared test. It multiplies two unrelated probabilities, and that product isn't terribly meaningful either, because the events are not independent. – whuber May 02 '22 at 19:18
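(The non-independence whuber points out is easy to see in the simulation itself; a quick sketch, reusing the question's setup with fewer replications for speed:)

```r
# Multinomial counts are negatively correlated: every extra x1 roll is one
# fewer roll available for x2 or x3. With three equally likely categories,
# the theoretical correlation between any two tallies is -1/2.
set.seed(123)
x <- replicate(2000, sample(1:3, size = 3000, replace = TRUE))
cor(colSums(x == 1), colSums(x == 2))   # close to -0.5, not 0
```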
  • Can you point to any resource that shows how to simulate a chi-square test? I've trawled through Google, Stack Exchange, Medium, etc. and can't find any. Ideally, I can get to the point where I can post an answer to my own question. – Rez99 May 02 '22 at 20:37
  • What aspect do you wish to simulate? We have many examples here on CV, for instance, where data are simulated and a chi-squared test is applied. In fact, in R you can simulate the null distribution just by specifying the simulate.p.value argument of chisq.test. The basic issue with your example is that it doesn't even compute the chi-squared statistic, so perhaps looking that up would be the place to start. – whuber May 02 '22 at 20:39
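(For what it's worth, the simulate.p.value route whuber mentions looks like this; a minimal sketch:)

```r
# chisq.test can resample the null multinomial itself: it generates B
# simulated tables and reports the fraction whose statistic is at least
# as large as the observed 4.2.
set.seed(123)
p.sim <- chisq.test(x = c(990, 1050, 960), p = c(1, 1, 1), rescale.p = TRUE,
                    simulate.p.value = TRUE, B = 10000)$p.value
p.sim   # ~0.12, close to the asymptotic p-value
```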
  • I'm trying to reverse engineer how / why a chi-square test works. Most resources start with the formula for the test statistic and then explain how to turn the handle to compute the p-value. That's fine, but it doesn't help me to intuitively understand why the test statistic is what it is in the first place (+ I don't have a background in statistics). Simulation has really helped me to understand why the analytical methods work. I can replicate a t-test just by simulating a sampling distribution. Similarly, here, I want to reproduce the p-value of a chi-square test w/o running a chi-square test. – Rez99 May 02 '22 at 20:53
  • One difficulty is that there are many different tests going by the name of "chi-square." At https://stats.stackexchange.com/a/17148/919 I describe one general such family, but there are many others. You might find some of our posts identified by chi-squared intuition to be helpful. – whuber May 02 '22 at 20:57
  • Thank you, this was helpful. I posted an answer below and a follow-up question here – Rez99 May 02 '22 at 23:42
  • @whuber I know that for a 3-sided die the chi-squared statistic follows a chi-squared distribution with 2 degrees of freedom. I also know that this distribution is equivalent to the sum of squares of 2 standard normal distributions. I was trying to replicate these distributions in this post -> what are these 2 distributions measuring exactly for a 3-sided die? – Rez99 May 03 '22 at 00:23
  • It is not the case for a goodness of fit test (including in your case) that "the chi-squared statistic follows a chi-squared distribution". The distribution of the test statistic is (necessarily, since it's a function of counts) discrete; rather, it is the case that the chi-squared distribution is an asymptotic approximation based on approximating the multinomial distribution of the counts by a (degenerate) normal with the same mean vector and variance-covariance matrix. It's generally quite a reasonable approximation in large samples. – Glen_b May 03 '22 at 02:21
  • Consequently if you simulate the null distribution of the test statistic you should see something close to (but not exactly equal to) an exponential with mean 2 as long as the sample size is large. – Glen_b May 03 '22 at 02:23
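(A quick sketch of Glen_b's point: a chi-squared variate with 2 df is the sum of squares of two independent standard normals, and that distribution is exponential with mean 2:)

```r
set.seed(1)
z1 <- rnorm(1e5); z2 <- rnorm(1e5)
ss <- z1^2 + z2^2            # sum of squares of two standard normals
mean(ss)                     # ~2, the mean of chi-square(2) / Exp(mean = 2)
# the survival functions agree: P(chi-square(2) > x) = exp(-x/2)
c(pchisq(4.2, df = 2, lower.tail = FALSE), exp(-4.2 / 2))
```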
  • Thanks for clarifying @Glen_b, I think I follow what you mean about an approximation (since the distribution of the test statistic is not continuous). Did the part of my question about the nature of the 2* standard normal distributions make sense? Idk what these distributions are measuring, so I can’t replicate. I’d love to be able to replicate, take the sum of squares and generate the chi square distribution that way. But idk if that’s possible? – Rez99 May 03 '22 at 02:45
  • e.g. n=90; css=replicate(10000,chisq.test(table(sample(3,n,replace=TRUE)))$statistic); plot(ecdf(css),do.points=FALSE,verticals=TRUE); f=function(x) pchisq(x,2); curve(f,0,15,col=2,add=TRUE) – Glen_b May 03 '22 at 04:47
  • try n=12, n=30 and n=300 for comparison. The warnings at n=12 will be because the expected counts are on the low side – Glen_b May 03 '22 at 04:53
  • Another thing to try is to calculate actual $\alpha$ (actual type I error rate) at some nominal significance level: mean(css>qchisq(.95,2)). For a 5% test it's pretty poor at n=15 (no warnings though!), not too bad at n=30. Calculate the standard error of the estimate along with it though. – Glen_b May 03 '22 at 04:58
  • n=26 to 28 might give you some pause, even though they all have expected counts close to 9 – Glen_b May 03 '22 at 05:07

1 Answer


Partial answer:

The code below does the following:

  1. Compute the test statistic for the observed tallies of 990, 1050, 960. The result is 4.2.
  2. Compute the test statistic for each row of tallies in the simulation of random tallies.
  3. Compute the proportion of rows whose test statistic is greater than or equal to 4.2.

The final step gives us (an estimate of) the probability of getting a set of tallies whose distance from the expected distribution is at least as extreme as the target set of tallies (990, 1050, 960).

ref.test.stat <- chisq.test(x = c(990,1050,960), p = c(1,1,1), rescale.p = TRUE)$statistic

df %>%
  mutate(test.stat = (1000 - x1)^2 / 1000 + (1000 - x2)^2 / 1000 + (1000 - x3)^2 / 1000) %>%
  mutate(ind = ifelse(test.stat >= ref.test.stat, 1, 0)) %>%
  summarise(mean(ind)) %>%
  pull()

0.1246
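As a sanity check, the asymptotic tail probability for the same statistic can be computed directly; it is close to, but not exactly equal to, the simulated 0.1246:

```r
# tail probability of chi-square(2) beyond the observed statistic of 4.2
ref.test.stat <- chisq.test(x = c(990, 1050, 960), p = c(1, 1, 1),
                            rescale.p = TRUE)$statistic
pchisq(ref.test.stat, df = 2, lower.tail = FALSE)   # ~0.1225
```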

Rez99
  • 233