The answer would probably include repeating this post, which deals with the resilience of the chi square test even when expected cell counts are below 5. So a chi square test may be the answer, and it could be approached as a goodness-of-fit (GOF) test, comparing the observed counts of "anxious" subjects in the 3 groups to the counts expected if anxiety were spread uniformly, i.e. in proportion to each group's size.
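As a minimal sketch of that GOF reading (assuming the anxious counts 5, 10 and 3 from the simulated table further down, and expected proportions given by the group sizes):

# Observed anxious counts in groups a, b, c
anxious <- c(5, 10, 3)
# Expected proportions if anxiety simply follows group size
group_sizes <- c(30, 20, 5)
chisq.test(x = anxious, p = group_sizes / sum(group_sizes))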
However, there will probably be a warning message when running the chi square test. I simulated a data set (the code is here, and it is also pasted below for convenience), with the same number of subjects per group as in the original post: a = 30, b = 20 and c = 5:
# Group labels: 30 subjects in a, 20 in b, 5 in c
subjects <- c(rep("a", 30), rep("b", 20), rep("c", 5))
# Anxiety labels: group a gets 5 A / 25 N, group b 10 A / 10 N,
# and group c is drawn at random, so its split depends on the seed
sam <- c(sample(c(rep("N", 25), rep("A", 5))),
         sample(c(rep("N", 10), rep("A", 10))),
         sample(c("A", "N"), 5, replace = TRUE))
(tab = table(subjects,sam))
        sam
subjects  A  N
       a  5 25
       b 10 10
       c  3  2
Group a was set up as much less anxious: 25 subjects labeled N, for "no anxiety".
The test output was:
(p_observed = chisq.test(tab, correct = F))
Pearson's Chi-squared test
data: tab
X-squared = 7.9142, df = 2, p-value = $\color{orange}{0.01912}$
But there was a warning message:
Warning message:
In chisq.test(tab, correct = F) :
Chi-squared approximation may be incorrect
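The warning arises because some expected counts fall below 5 (here, group c contributes expected counts of about 1.6 and 3.4). They can be inspected directly from the test object:

chisq.test(tab, correct = F)$expected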
So I wanted to see if we could come up with an ad hoc permutation test: shuffle the A/N labels across all subjects, under the null hypothesis of no difference in the distribution of anxious subjects across groups, and then run a chi square test on the resulting tabulated frequencies. Disregarding thus any issues with the small size of group c, and just paying attention to the relative p value of every iteration with respect to the others, I think it is fair to say that the proportion of permutations with a p value lower than the one observed in the actual data is an exact p value - a simulated Fisher test or, probably more exactly, a permutation test.
Here's the code in R, and the results:
set.seed(0)
options(warn = 0)
(tab = table(subjects, sam))
(p_observed = chisq.test(tab, correct = F))
chisq.test(tab, simulate.p.value = T)
pval <- rep(NA, 1e4)             # storage for the permutation p values
options(warn = -1)               # silence the small-counts warning inside the loop
for (i in 1:1e4){
  anx <- sample(sam)                 # shuffle the A/N labels across all subjects
  perm_tab <- table(subjects, anx)   # separate name, so tab keeps the observed table
  pval[i] <- chisq.test(perm_tab, correct = F)$p.value
}
(p_value = mean(pval < p_observed$p.value))
The resulting p_value = $\color{red}{0.0186}$ was lower than the one initially calculated with chisq.test.
I was surprised that this p value was also lower than the one calculated via Monte Carlo simulation within the R built-in function:
chisq.test(tab, simulate.p.value = T)
Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
data: tab
X-squared = 7.9142, df = NA, p-value = 0.02099
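The default number of replicates is B = 2000; the Monte Carlo error of that estimate can be reduced by asking for more, for instance:

chisq.test(tab, simulate.p.value = T, B = 1e4)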
After getting a significant result, pairwise comparisons can be performed with a Bonferroni correction (level of significance $0.05 / \text{no. of hypotheses} = 0.05 / 3 = \color{blue}{0.0167}$). These can be obtained directly with the R built-in fisher.test function:
Between group a and b:
fisher.test(tab[1:2,], alternative = "two.sided")
p-value = 0.02546
Between b and c:
fisher.test(tab[2:3,], alternative = "two.sided")
p-value = 1
And a and c:
fisher.test(tab[c(1,3),], alternative = "two.sided")
p-value = 0.06654
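Equivalently, instead of lowering the significance level, one can Bonferroni-adjust the three Fisher p values and compare them to 0.05, with base R's p.adjust:

p.adjust(c(0.02546, 1, 0.06654), method = "bonferroni")
# 0.07638 1.00000 0.19962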
Oddly, none of the pairwise results is significant, because of the conservative nature of the Fisher test. If we were to run a chisq.test between groups a and b - and the sample sizes would clearly allow it - we would get a statistically significant result:
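Using the same row subsetting as in the fisher.test calls above:

chisq.test(tab[1:2, ], correct = F)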
$\text{p-value} = \color{green}{0.01174}$.