There are several nearly equivalent tests to compare
two binomial proportions. The parameter of interest is
the difference between the two population proportions
$p_1 - p_2,$ which is usually estimated by the difference
between the corresponding two sample proportions
$\hat p_1 - \hat p_2,$ where $\hat p_i = x_i/n_i,$ for $i=1,2,$ with $x_i$ successes in $n_i$ trials.
Differences among the tests center on how or whether
to use a normal approximation and on how to estimate
the standard deviation of $\hat p_1 - \hat p_2,$ often
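called the (estimated) standard error. Roughly speaking,
one method is to assume equality of $p_1$ and $p_2,$ estimating $p = p_1 = p_2$ as $\hat p = \frac{x_1+x_2}{n_1+n_2}$ and using $\widehat{\mathrm{Var}}(\hat p_1 - \hat p_2) = \hat p(1-\hat p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right).$
An alternative method is to estimate the variances of the $\hat p_i$ separately and add, giving $\widehat{\mathrm{Var}}(\hat p_1 - \hat p_2) = \frac{\hat p_1(1-\hat p_1)}{n_1} + \frac{\hat p_2(1-\hat p_2)}{n_2}.$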
Moreover, especially for small $n_i,$ various tests use
different continuity corrections when invoking normal approximations, and other tests use no continuity correction.
Fortunately, these variations in method often make very
little difference in final results. So it is more important
to remember that the variations exist (so as not to be puzzled when various analyses do not match exactly) than to worry
about which to use.
In R, the procedure prop.test uses a test statistic with an approximate chi-squared distribution. Suppose that there are $n_1 = 100$ subjects in the A group with $x_1 = 83$ successes
and, independently, $n_2 = 92$ subjects in the B group with
$x_2 = 88$ successes, so that $\hat p_1 = 0.83, \hat p_2 \approx 0.9565.$ These two sample proportions differ
significantly at the $1\%$ level because the P-value
of the test is smaller than $0.01.$
prop.test(c(83,88), c(100,92), cor=F)
2-sample test for equality of proportions
without continuity correction
data: c(83, 88) out of c(100, 92)
X-squared = 7.8742, df = 1, p-value = 0.005015
alternative hypothesis: two.sided
95 percent confidence interval:
-0.21111962 -0.04192386
sample estimates:
prop 1 prop 2
0.8300000 0.9565217
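As a rough check on where these numbers come from, the two standard-error estimates described earlier can be computed by hand. This is only a sketch, not the internal code of prop.test; the variable names (p_pool, se_pool, se_sep, and so on) are my own. For these data, the pooled standard error reproduces the test statistic and P-value, and the separately estimated standard error reproduces the confidence interval.

```r
# Counts from the example above: successes x and trials n in groups A and B.
x <- c(83, 88)
n <- c(100, 92)
p_hat <- x / n
d_hat <- p_hat[1] - p_hat[2]

# Method 1: pooled estimate, assuming p1 = p2 under the null hypothesis.
p_pool  <- sum(x) / sum(n)
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z       <- d_hat / se_pool
p_value <- 2 * pnorm(-abs(z))   # two-sided normal P-value

# Method 2: estimate the variance of each sample proportion separately, then add.
se_sep <- sqrt(sum(p_hat * (1 - p_hat) / n))
ci     <- d_hat + c(-1, 1) * qnorm(0.975) * se_sep

z^2      # compare with X-squared = 7.8742
p_value  # compare with p-value = 0.005015
ci       # compare with the 95 percent confidence interval
```

The square of a standard normal statistic has a chi-squared distribution with one degree of freedom, which is why the two-sided z test and the chi-squared form give the same P-value.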
Notice that the test performed by prop.test is the same as a chi-squared test
of homogeneity on the $2 \times 2$ table whose columns
are for A and B, and whose rows are for Success and Failure.
(Particularly with sample sizes around 100 or larger,
I choose to use the argument cor=F to suppress the
continuity correction.)
TAB = rbind(c(83,88), c(17,4)); TAB
chisq.test(TAB, cor=F)
[,1] [,2]
[1,] 83 88
[2,] 17 4
Pearson's Chi-squared test
data: TAB
X-squared = 7.8742, df = 1, p-value = 0.005015
The P-value is exactly the same as for the two-sided prop.test above, but no confidence interval or
estimates $\hat p_i$ are given.
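To see how much the continuity correction matters for these particular data, one can run prop.test both ways and compare. This is a small sketch; the object names no_cc and with_cc are mine, and correct = FALSE is simply the full spelling of cor=F.

```r
x <- c(83, 88)
n <- c(100, 92)

no_cc   <- prop.test(x, n, correct = FALSE)  # as in the call above
with_cc <- prop.test(x, n, correct = TRUE)   # continuity correction (R's default)

# Side-by-side test statistics and P-values.
rbind(
  no_cc   = c(X.squared = unname(no_cc$statistic),   p.value = no_cc$p.value),
  with_cc = c(X.squared = unname(with_cc$statistic), p.value = with_cc$p.value)
)
```

The correction shrinks the test statistic, so the corrected P-value is always somewhat larger. When a P-value is near a fixed cutoff, it is worth looking at both versions before stating a conclusion at that level.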
Notes: (1) If some of the expected counts for TAB are very small (thus triggering a warning message), it is
best to use chisq.test with parameter sim=T (short for simulate.p.value=TRUE) to get
a simulated P-value that may be more useful than the one from the
traditional chi-squared approximation.
(2) Several other Answers on this site discuss
tests of binomial proportions. You may find additional examples and discussions of alternative tests there. Also, uses of alternative tests can be
found online.
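Regarding Note (1), a simulated P-value might be obtained roughly as follows. The seed and the number of simulated tables B = 10000 are arbitrary choices of mine. The table TAB from above is reused only to show the call: its expected counts are all above 5, so it does not actually trigger the warning.

```r
TAB <- rbind(c(83, 88), c(17, 4))

# Expected counts under homogeneity; R warns when any of these is below 5.
expected <- chisq.test(TAB, correct = FALSE)$expected
expected

set.seed(2024)  # arbitrary seed, only to make the simulation reproducible
sim <- chisq.test(TAB, simulate.p.value = TRUE, B = 10000)
sim$p.value     # varies slightly with the seed and with B
```

Because the P-value is simulated, repeated runs without a fixed seed give slightly different values; a larger B reduces that variation.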