Can I use Fisher's exact test on a $1 \times n$ table to compare multiple proportions?

Question

Let $X$ be $1$ (success) if some event occurs and $0$ otherwise, and let $p_i$ be the proportion of successes in population $i$. I'd like to use Fisher's exact test to test:

$H_0: p_i=p_j$ for all $i \neq j$ with $i,j \in \{1,...,n\}$

$H_1:$ There is at least one difference.

Fisher's exact test tests whether there is an association between two categorical variables; that is, it tests whether all the cell counts of a $2 \times 2$ contingency table are the same. The levels of the first categorical variable run along the top of the table and the levels of the other run down the left side. I modified the table to be $1 \times n$ (row $\times$ column), where the top indexes the populations and each cell holds the count of successes in the sample from that population (the number of failures in each sample is known). Can I use Fisher's exact test in this situation to test whether the proportions of successes are the same, or can it only be used with two categorical variables on a $2 \times 2$ contingency table?

I don’t see why you have one column instead of two. Could you please explain that? — Dave, Dec 24 '21 at 02:56
One row: it holds the $n$ success counts, one for each sample from the respective population. — Davi Américo, Dec 24 '21 at 03:13
Do you know how many failures (or total attempts) there were? — Dave, Dec 24 '21 at 03:41
So why don’t you have a column with the number of successes and another with the number of failures? — Dave, Dec 24 '21 at 05:11
Because then it looks like a contingency table, as in the traditional use of Fisher's test. — Davi Américo, Dec 24 '21 at 05:57

BruceET · Accepted Answer · 2021-12-24T06:35:02.770

Several procedures in R give much the same result.

Suppose Yes's and No's in four categories are as follows.

Yes = c(51,74, 22, 2)
No =  c(57, 99, 55, 5)
All = Yes + No
TBL = rbind(Yes, No)
TBL
    [,1] [,2] [,3] [,4]
Yes   51   74   22    2
No    57   99   55    5

Test to see if proportions of Yes's are the same in all four categories:

prop.test(Yes, All, cor=F)
    4-sample test for equality of proportions 
    without continuity correction


data:  Yes out of All
X-squared = 7.3227, df = 3, p-value = 0.06229
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4 
0.4722222 0.4277457 0.2857143 0.2857143
Warning message:
In prop.test(Yes, All, cor = F) :
  Chi-squared approximation may be incorrect

The warning message is given because of the small counts in Category 4. (Essentially, this test uses a normal approximation, expressed in terms of a chi-squared statistic with 3 DF.)

The prop.test procedure in R is essentially the same as a chi-squared test of homogeneity on TBL, without the Yates continuity correction.

chisq.test(TBL, cor=F)
Pearson's Chi-squared test


data:  TBL
X-squared = 7.3227, df = 3, p-value = 0.06229
Warning message:
In chisq.test(TBL) : 
  Chi-squared approximation may be incorrect

In this version of the test, it is easier to show explicitly why the P-value may be incorrect. The warning message is given whenever any one of the expected counts in the chi-squared test is smaller than $5.$ Here is how to display the table of expected counts. Notice that the P-value is exactly the same as above.

chisq.test(TBL, cor=F)$exp
        [,1]      [,2]     [,3]     [,4]
Yes 44.08767  70.62192 31.43288 2.857534
No  63.91233 102.37808 45.56712 4.142466
Warning message:
In chisq.test(TBL, cor=F) : 
 Chi-squared approximation may be incorrect
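The expected counts can also be reproduced by hand, which makes it clear where the warning comes from. The sketch below uses the same fictitious data; the names `expected` and `X2` are introduced here only for illustration.

```r
# Same fictitious data as above.
Yes <- c(51, 74, 22, 2)
No  <- c(57, 99, 55, 5)
TBL <- rbind(Yes, No)

# Expected count for each cell under homogeneity:
# (row total * column total) / grand total.
expected <- outer(rowSums(TBL), colSums(TBL)) / sum(TBL)
expected

# Cells with expected count below 5 (these trigger the warning).
which(expected < 5, arr.ind = TRUE)

# Pearson statistic from its definition, and the chi-squared P-value
# with (rows - 1) * (columns - 1) degrees of freedom.
X2 <- sum((TBL - expected)^2 / expected)
df <- (nrow(TBL) - 1) * (ncol(TBL) - 1)
pchisq(X2, df = df, lower.tail = FALSE)
```

Both cells of Category 4 fall below 5, so the approximation is in doubt only because of that one sparse column.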

However, as implemented in R, chisq.test can simulate a more accurate P-value (using the parameter sim=T, an abbreviation of simulate.p.value=TRUE). Notice the (slight) change in the simulated P-value.

chisq.test(TBL, sim=T)
    Pearson's Chi-squared test 
    with simulated p-value 
    (based on 2000 replicates)


data:  TBL
X-squared = 7.3227, df = NA, p-value = 0.06347
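Because the P-value is simulated, it changes a little from run to run. A minimal sketch of a reproducible run with more replicates follows; the seed and the value of `B` are arbitrary choices made here, not part of the original analysis.

```r
# Same fictitious data as above.
Yes <- c(51, 74, 22, 2)
No  <- c(57, 99, 55, 5)
TBL <- rbind(Yes, No)

set.seed(2021)   # arbitrary seed, only so the run can be repeated
B <- 20000       # number of simulated tables (the default is 2000)
sim <- chisq.test(TBL, simulate.p.value = TRUE, B = B)
sim$p.value

# Rough Monte Carlo standard error of the simulated P-value.
sqrt(sim$p.value * (1 - sim$p.value) / B)
```

With more replicates the simulated P-value becomes more stable, at the cost of a longer run.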

Traditionally, Fisher's exact test (based on a hypergeometric distribution according to marginal counts) was limited to $2 \times 2$ tables with relatively small counts. However, its implementation in R can use larger tables (within limits of available computer memory to do the computations). The table TBL is as suggested in one of @Dave's Comments.

fisher.test(TBL)
    Fisher's Exact Test for Count Data


data:  TBL
p-value = 0.05681
alternative hypothesis: two.sided
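For tables too large to enumerate exactly, fisher.test also accepts simulate.p.value=TRUE, which replaces the exact computation with a Monte Carlo one. A hedged sketch on the same data, with an arbitrary seed and number of replicates chosen here for illustration:

```r
# Same fictitious data as above.
Yes <- c(51, 74, 22, 2)
No  <- c(57, 99, 55, 5)
TBL <- rbind(Yes, No)

# Exact P-value for the 2 x 4 table.
exact <- fisher.test(TBL)
exact$p.value

# Monte Carlo version, useful when the exact computation runs out of
# workspace on a large table. Seed and B are arbitrary choices.
set.seed(2021)
approx <- fisher.test(TBL, simulate.p.value = TRUE, B = 20000)
approx$p.value
```

For a table this small the exact version is fast, so the simulated one is shown only for comparison.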

Note: For the fictitious data used here, all tests are significant at the 7% level, but not at the 5% level. Often the simulated P-value of the chi-squared test of homogeneity and the P-value of Fisher's exact test differ more from each other than they do for these data.

This example uses a $2 \times 4$ table, not a $1 \times n$ one. If I have only the "Yes" row across the four ($n$) levels of a category, could I still use this? If so, I'll consider your answer. — Davi Américo, Dec 24 '21 at 08:35
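A row of success counts alone does not determine the proportions; the sample sizes are needed too. When the number of failures (or the total) in each sample is known, as the question states, the second row can be rebuilt and the $2 \times n$ table tested as in the answer. A minimal sketch, where `successes`, `n_total` and `tab` are hypothetical names and the totals are those of the fictitious data above:

```r
# Hypothetical inputs: success counts and sample sizes per population.
successes <- c(51, 74, 22, 2)
n_total   <- c(108, 173, 77, 7)

# Rebuild the missing row from the known sample sizes.
failures <- n_total - successes
stopifnot(all(failures >= 0))

# 2 x n table: one column per population.
tab <- rbind(Yes = successes, No = failures)
tab

# Test H0: the success proportion is the same in every population.
fisher.test(tab)$p.value
```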
