Here is some data used for a chi-square test:

       High    Low
B       62     10
Non-B   1366   97

chisq.test(VolumeData)

Pearson's Chi-squared test with Yates' continuity correction

data:  VolumeData
X-squared = 4.5124, df = 1, p-value = 0.03365

This may be trivial, but I got a statistically significant output. This does mean that "B" and "High" are more closely related than other relationships, correct?

The goal was to answer: Are those who visit "High" more likely to be "B" than those who visit "Low"?

How can I find out if the results are meaningful since statistical significance doesn't prove that?

Yule's Q calculation gave me a moderate association. Does this prove that the result is meaningful?

psych::Yule(VolumeData)

-0.3886347 # moderate

1 Answer


I will go through your questions one-by-one.

  1. This may be trivial, but I got a statistically significant output. This does mean that "B" and "Low" are closer related than other relationships, correct?

Without knowing exactly what these variables mean, we can see that something unusual is going on with the Low Group x B Group: it holds a substantial number of cases (over 1000) compared to the others (fewer than 100 each). The chi-square test of a 2x2 contingency table tests the null hypothesis that there is no relation between one factor and the other. Clearly the high number in the Low B group allows you to reject the null hypothesis, as it implies a relationship between Low Group and B Group (and maybe even in the High x Non-B group; see the edit to this answer below).
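To make the null hypothesis concrete, here is a minimal R sketch (R being the language used in the question) that rebuilds the table as printed in the question, shows the counts expected under independence, and recomputes the Yates-corrected statistic by hand. The dimension labels `Group` and `Volume` are my own assumption; the question does not name them.

```r
# Rebuild the 2x2 table from the question.
# The dimension labels "Group" and "Volume" are assumed for illustration.
VolumeData <- matrix(c(62, 10,
                       1366, 97),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Group = c("B", "Non-B"),
                                     Volume = c("High", "Low")))

# chisq.test() applies Yates' continuity correction to 2x2 tables by default
test <- chisq.test(VolumeData)

# Counts expected in each cell if Group and Volume were independent:
# (row total * column total) / grand total
expected <- test$expected
print(round(expected, 2))

# Yates-corrected statistic computed by hand from observed and expected counts
x2_manual <- sum((abs(VolumeData - expected) - 0.5)^2 / expected)
print(x2_manual)
```

The hand-computed value should agree with the `X-squared` reported in the question, which shows that the statistic is built from the deviations in all four cells at once.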

  2. How can I find out if the results are meaningful since statistical significance doesn't prove that?

You're right about statistical significance. It doesn't "prove" anything; it only tells you that, if the null hypothesis were true, the probability of finding a result at least this extreme would fall below a chosen threshold. Your Yule coefficient helps, to a degree, answer whether the magnitude of the effect is meaningful.

  3. Yule's Q calculation gave me moderate association. Does this prove that the result is meaningful?

Generally speaking, statistics is shy about using the word "proof." This result indicates some moderate association between these two factors, but that doesn't necessarily prove anything. As I've mentioned in another answer to a similar question, there are likely a number of unexplored confounds or theoretically meaningful reasons why this effect is appearing. You can say with some confidence that there is an effect, that the effect is moderate, and that further research may elucidate why it appears. I would just generally be cautious with the word "prove" in cases like these.
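To judge magnitude rather than significance, it can help to look at the odds ratio and the column proportions alongside Yule's Q. The following is a rough base-R sketch, assuming the table exactly as printed in the question; the variable names and the `Group`/`Volume` labels are my own, and the confidence interval is the usual Wald approximation on the log scale, not something taken from the question.

```r
# Table as printed in the question ("Group" and "Volume" are assumed labels)
VolumeData <- matrix(c(62, 10,
                       1366, 97),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Group = c("B", "Non-B"),
                                     Volume = c("High", "Low")))

n_b_high    <- VolumeData["B", "High"]
n_b_low     <- VolumeData["B", "Low"]
n_nonb_high <- VolumeData["Non-B", "High"]
n_nonb_low  <- VolumeData["Non-B", "Low"]

# Odds ratio: odds of High among B divided by odds of High among Non-B
odds_ratio <- (n_b_high * n_nonb_low) / (n_b_low * n_nonb_high)

# Yule's Q is a rescaling of the odds ratio onto the interval [-1, 1]
yule_q <- (odds_ratio - 1) / (odds_ratio + 1)

# Approximate 95% Wald interval for the odds ratio (computed on the log scale)
se_log_or <- sqrt(sum(1 / VolumeData))
ci <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)

# Share of B within each column: the direct answer to
# "are High visitors more likely to be B than Low visitors?"
prop_b <- prop.table(VolumeData, margin = 2)["B", ]

print(odds_ratio)
print(yule_q)
print(ci)
print(round(prop_b, 3))
```

With the table as printed, `yule_q` reproduces the value reported by `psych::Yule` in the question. An odds ratio below 1 and a negative Q point the same way: B makes up a smaller share of the High column than of the Low column, so the direction of the association is worth checking against the labels before interpreting it.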

Edit

As J-J-J pointed out, your "Non-B High" group also has very few cases (only 10), which I didn't notice when I first saw the question. So the interpretation may not be as straightforward, because the chi-square test only checks for any departure from independence across the whole table, not which specific cell is responsible. You would likely need to add that as part of your interpretation: that both groups may be causing this result, or even that all of the groups have some conditional relationship you are not aware of. He provided an excellent example in the comments that illustrates how complicated this interpretation can get. The best thing I can suggest is having a theoretically driven explanation as to why this occurred.
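One way to see why no single cell can be credited with the result is to inspect the standardized residuals that `chisq.test()` returns. This is only a sketch on the table as printed in the question, with assumed `Group`/`Volume` labels.

```r
# Table as printed in the question ("Group" and "Volume" are assumed labels)
VolumeData <- matrix(c(62, 10,
                       1366, 97),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Group = c("B", "Non-B"),
                                     Volume = c("High", "Low")))

test <- chisq.test(VolumeData)

# Standardized residuals: (observed - expected) scaled by its standard error
# under independence. These do not use the continuity correction.
std_res <- test$stdres
print(round(std_res, 2))

# For comparison: square root of the uncorrected Pearson statistic
x2_uncorrected <- unname(chisq.test(VolumeData, correct = FALSE)$statistic)
print(sqrt(x2_uncorrected))
```

In a 2x2 table all four standardized residuals have the same absolute value (the square root of the uncorrected statistic) and differ only in sign, so the test itself cannot tell you which cell is "responsible". That part has to come from the substantive question being asked.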

  • I don't think it's the "high number in the Low B group [that] allows you to reject the null hypothesis", as you can also say that's the low number in "Non-B High" (or any number in any cell) that allows you to do that. Related answer: https://stats.stackexchange.com/a/222276/164936 – J-J-J Dec 13 '22 at 09:01
  • Interesting, I didn't even notice how low Non-B high was, but I think you're correct. I've amended my answer. – Shawn Hemelstrand Dec 13 '22 at 09:26
  • I think a problem is that the question is not entirely clear. Obviously there's more probability to be in the "Low B" group in general. On the other hand, if the underlying question is more subtle, there are other things that can be useful to look at, like the standardized residuals or the odds ratio (e.g. if the question is "By how much are you more likely to be in the Low group if you are B, compared to someone in the non-B group?"). I think it would be helpful to know what is the ultimate goal of the analysis. – J-J-J Dec 13 '22 at 09:59
  • I agree. Some context would be helpful. – Shawn Hemelstrand Dec 13 '22 at 11:03