
For a df that looks something like the following

group  signedup
A             1
B             1
A             1
B             1
B             0
B             0
A             0

I need to calculate the difference in means between group A and B for the 'signedup' attribute. I'm not sure if my solution is correct. Any insights will be appreciated!

Some background information: 'group' indicates whether the user is assigned to the control group (A) or the treatment group (B). 'signedup' indicates whether the user signed up for the premium service or not, with a 1 or 0, respectively.

import pandas as pd
from scipy import stats as scs

df = pd.read_csv(filename)

# 'signedup' values (0/1) for each group, selected as Series
df_A = df.loc[df['group'] == 'A', 'signedup']
df_B = df.loc[df['group'] == 'B', 'signedup']

# Two-sample t test comparing the group means
t, p = scs.ttest_ind(df_A, df_B)
if p < 0.05:
    print('Difference in means is statistically significant')
  • Welcome to Cross Validated! Is this a statistics question disguised as a coding question? If you just want to know about your Python code, 1) at a glance, it looks right and 2) pure coding questions are off-topic here, so a thorough debug or code review is not for Cross Validated. // For reasons discussed here, the t-test is not ideal here. I am a fan of the G-test, though I do not know a canned Python function. The chi-squared test (similar) probably is in scipy. – Dave Jul 31 '21 at 18:56
  • Unrelated, is it common to import stats as scs like importing numpy as np and pandas as pd? I have not done that, though most of my statistics work is in R, while I use Python for my data wrangling. – Dave Jul 31 '21 at 18:58
  • Hi Dave, I was asking if the use of a T test is correct for difference in means. – freshman_2021 Jul 31 '21 at 18:59
  • and yes it is totally acceptable to import stats the way I have done – freshman_2021 Jul 31 '21 at 19:01
  • My linked answer addresses the issue of t-testing binary variables like you have. In summary, you can do better than the t-test. – Dave Jul 31 '21 at 19:06
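
(As a side note on the chi-squared test mentioned in the comments: it is available in scipy as scipy.stats.chi2_contingency, which takes a table of counts. Below is a minimal sketch, assuming the df from the question is already loaded; keep in mind the chi-squared approximation needs reasonably large expected counts, so it is not meant for the tiny example table itself.)

import pandas as pd
from scipy import stats as scs

# 2x2 table of counts: rows = group (A/B), columns = signedup (0/1)
counts = pd.crosstab(df['group'], df['signedup'])

# Chi-squared test of independence on the contingency table
chi2, p, dof, expected = scs.chi2_contingency(counts)
print(p)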

1 Answer


Here are three tests that are commonly used to compare binomial proportions, in appropriate circumstances.

Consider fictitious data sampled in R, based on sample sizes $n = 300$ and actual success rates $p_a = 0.6, p_b = 0.7.$ All three tests detect a difference with P-values about $0.001.$

set.seed(731)
n = 300
x.a = rbinom(n, 1, .60)
x.b = rbinom(n, 1, .70)

table(x.a)
x.a
  0   1 
124 176 
table(x.b)
x.b
  0   1 
 85 215 

Welch t test: Appropriate for large $n.$

t.test(x.a, x.b)
    Welch Two Sample t-test

data:  x.a and x.b
t = -3.3677, df = 593.35, p-value = 0.0008071
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.20581319 -0.05418681
sample estimates:
mean of x mean of y 
0.5866667 0.7166667 

A test of binomial proportions using a normal approximation, similar to a chi-squared test on a $2 \times 2$ table. [The continuity correction is slightly conservative and may not be needed for $n$ as large as $300.$]

yes = c(sum(x.a),sum(x.b));  yes
[1] 176 215
prop.test(yes, c(n,n))
    2-sample test for equality of proportions 
    with continuity correction

data:  yes out of c(n, n)
X-squared = 10.602, df = 1, p-value = 0.00113
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.20886568 -0.05113432
sample estimates:
   prop 1    prop 2 
0.5866667 0.7166667 

Fisher's Exact Test, which uses a hypergeometric distribution based on row and column totals of a $2 \times 2$ table, under the null hypothesis that the two groups have equal success probabilities. [It is especially useful if $n$ is small.]

TBL = cbind(yes, n-yes);  TBL
     yes    
[1,] 176 124
[2,] 215  85
fisher.test(TBL)
    Fisher's Exact Test for Count Data

data:  TBL
p-value = 0.001105
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.3932415 0.7998073
sample estimates:
odds ratio 
 0.5616766 

A brief simulation in R shows that the Welch t test has power of about 99% for detecting a difference between $p.a = 0.5$ and $p.b = 0.7$ with sample sizes $n=300.$ Similar simulations can find the power of the other two tests for appropriate sample sizes.

set.seed(2021)
n = 300; p.a = 0.5; p.b = 0.7
pv = replicate(10^5, 
               t.test(rbinom(n,1,p.a),rbinom(n,1,p.b))$p.val)
mean(pv <= 0.05)
[1] 0.99898  # approximate power

For $n = 150$ and the same proportions, the power is about 95%.

Note: Python must have equivalent procedures for such tests.
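
For instance, here is a rough sketch of Python equivalents (my own illustration, not part of the answer above; it assumes scipy and statsmodels are installed and uses the counts simulated above, 176 and 215 successes out of $n = 300$ in each group):

import numpy as np
from scipy import stats as scs
from statsmodels.stats.proportion import proportions_ztest

yes = np.array([176, 215])     # successes in groups a and b
n = np.array([300, 300])       # sample sizes

# Welch two-sample t test on reconstructed 0/1 data (equal_var=False gives Welch)
x_a = np.repeat([0, 1], [300 - 176, 176])
x_b = np.repeat([0, 1], [300 - 215, 215])
t, p_t = scs.ttest_ind(x_a, x_b, equal_var=False)

# Two-sample test of proportions using a normal approximation (like prop.test)
z, p_z = proportions_ztest(yes, n)

# Fisher's exact test on the 2x2 table of successes and failures
odds, p_f = scs.fisher_exact(np.column_stack([yes, n - yes]))

print(p_t, p_z, p_f)

These should give p-values close to the R results shown above.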

BruceET