Suppose we have 800 school-age subjects in Group A, of whom
683 are enrolled in school. Suppose that in Group B the corresponding
numbers are 1001 out of 1100.
Test of proportions. A test of proportions, prop.test in R, uses
the normal distribution to approximate binomial probabilities, but
the sample sizes here are large enough for a good approximation.
It gives the following results:
prop.test(c(683,1001), c(800,1100))
2-sample test for equality of proportions
with continuity correction
data: c(683, 1001) out of c(800, 1100)
X-squared = 13.991, df = 1, p-value = 0.0001837
alternative hypothesis: two.sided
95 percent confidence interval:
-0.08708815 -0.02541185
sample estimates:
prop 1 prop 2
0.85375 0.91000
It is not clear whether the continuity correction is appropriate
for such large samples. Here is the (very slightly smaller)
P-value without the correction.
prop.test(c(683,1001), c(800,1100), correct=FALSE)$p.value
[1] 0.00013692
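To see where the uncorrected result comes from, it can be reproduced by hand. This is a minimal sketch, assuming the usual pooled two-sample z statistic, whose square should match the X-squared value that prop.test reports when the correction is turned off. The names p.pool, se, z and p.manual are my own and not part of prop.test.

```r
# Counts from the example above
s = c(683, 1001);  n = c(800, 1100)

p.hat  = s / n                        # 0.85375 0.91000
p.pool = sum(s) / sum(n)              # pooled estimate under H0: equal proportions
se = sqrt(p.pool * (1 - p.pool) * sum(1 / n))
z  = (p.hat[1] - p.hat[2]) / se       # pooled z statistic
p.manual = 2 * pnorm(-abs(z))         # two-sided P-value from the normal approximation

# Compare with prop.test without the continuity correction
res = prop.test(s, n, correct = FALSE)
c(z^2, unname(res$statistic))         # the two values should agree
c(p.manual, res$p.value)              # and so should the P-values
```

The continuity correction shrinks the numerator of the statistic slightly, which is why the corrected P-value is a little larger.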
The P-value is about 0.0002 with the continuity correction and about
0.0001 without it; either way, there is strong evidence
that the groups differ.
Using a two-sample t test. If we use a t test on the same data, then we are
comparing samples x and y: In x, for Group A,
we have 683 $1$'s and the rest of the observations
are $0$'s. In y, for Group B, we have 1001 $1$'s and the
rest are $0$'s. The t test is only approximate because
the data are not normal, but the sample sizes are large enough
for the P-value to be a good approximation.
x = rep(1:0, c(683, 800-683))
y = rep(1:0, c(1001,1100-1001))
table(x)
x
0 1
117 683
table(y)
y
0 1
99 1001
t.test(x,y)
Welch Two Sample t-test
data: x and y
t = -3.7026, df = 1495.5, p-value = 0.0002211
alternative hypothesis:
true difference in means is not equal to 0
95 percent confidence interval:
-0.08604969 -0.02645031
sample estimates:
mean of x mean of y
0.85375 0.91000
The P-value is about 0.0002, leading to the same
interpretation as for the test of proportions.
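The two tests agree closely because, for 0/1 data, the sample variance is just p(1-p) scaled by n/(n-1), so the Welch statistic differs from the z statistic mainly in using an unpooled standard error. The following is a rough sketch of that calculation from the counts alone; the names v, se, t.manual and df are mine, and the degrees of freedom use the standard Welch-Satterthwaite formula.

```r
n = c(800, 1100);  p.hat = c(683, 1001) / n

# For 0/1 data the sample variance is p(1-p) * n/(n-1)
v  = p.hat * (1 - p.hat) * n / (n - 1)
se = sqrt(sum(v / n))                 # unpooled (Welch) standard error
t.manual = (p.hat[1] - p.hat[2]) / se

# Welch-Satterthwaite approximate degrees of freedom
df = sum(v / n)^2 / sum((v / n)^2 / (n - 1))

# Compare with t.test on the expanded 0/1 vectors
x = rep(1:0, c(683, 800 - 683))
y = rep(1:0, c(1001, 1100 - 1001))
res = t.test(x, y)
c(t.manual, unname(res$statistic))    # should agree
c(df, unname(res$parameter))          # should agree
```

With samples this large, the t distribution on roughly 1500 degrees of freedom is nearly normal, so the remaining difference in P-values comes almost entirely from pooled versus unpooled variance estimates.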
I would do the test of proportions, but would not want to
disparage use of the t test.