This seems like it should be a really simple thing, so I apologize if this has been answered elsewhere, but I've browsed dozens of existing questions and none seem to fit exactly.
Suppose I have two cohort groups, control and target. I apply a treatment to the target, and want to use DiD to compare conversion deltas between the two groups; pre and post treatment.
I believe the following r code correctly proves that in our instance, the effect is truly non-zero (i.e. we reject the NULL hypothesis). However, because it's a binomial GLM, the confidence intervals are bounding the probability that it's non-zero, not the actual effect itself:
Converted <- c(2526,2500,2818,2268)
Offered <- c(3992,3956,4484,3996)
Cohort <- factor(c("Target","Target","Control","Control"))
Period <- factor(c(0,1,0,1))
ConvRate <- Converted / Offered
fit <- glm(ConvRate~Cohort*Period, family=binomial, weights=Offered)
summary(fit)
confint(fit)
(Summarized) Output:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.52560 0.03090 17.007 < 2e-16 ***
CohortTarget 0.01850 0.04509 0.410 0.681633
Period1 -0.25367 0.04444 -5.708 1.14e-08 ***
CohortTarget:Period1 0.25017 0.06434 3.888 0.000101 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 5.1064e+01 on 3 degrees of freedom
Residual deviance: -7.9270e-13 on 0 degrees of freedom
AIC: 42.851
Number of Fisher Scoring iterations: 2
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) 0.4651823 0.5863361
CohortTarget -0.0698583 0.1069002
Period1 -0.3407991 -0.1665982
CohortTarget:Period1 0.1240757 0.3762874
In order to get the actual effect size, I ran this through a LM, but it's refusing to provide me with confidence intervals;
lfit <- lm(ConvRate~Cohort*Period, weights=Offered)
summary(lfit)
confint(lfit)
Output:
Call:
lm(formula = ConvRate ~ Cohort * Period, weights = Offered)
Weighted Residuals:
ALL 4 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.628457 NA NA NA
CohortTarget 0.004309 NA NA NA
Period1 -0.060889 NA NA NA
CohortTarget:Period1 0.060075 NA NA NA
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 3 and 0 DF, p-value: NA
2.5 % 97.5 %
(Intercept) NaN NaN
CohortTarget NaN NaN
Period1 NaN NaN
CohortTarget:Period1 NaN NaN
Warning message:
In qt(a, object$df.residual) : NaNs produced
A similar issue results with family=gaussian when using GLM again:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.628457 NA NA NA
CohortTarget 0.004309 NA NA NA
Period1 -0.060889 NA NA NA
CohortTarget:Period1 0.060075 NA NA NA
(Dispersion parameter for gaussian family taken to be NaN)
Null deviance: 1.2194e+01 on 3 degrees of freedom
Residual deviance: 2.0249e-28 on 0 degrees of freedom
AIC: -272.54
Number of Fisher Scoring iterations: 1
Waiting for profiling to be done...
Error in while ((step <- step + 1) < maxsteps && abs(z) < zmax) { :
missing value where TRUE/FALSE needed
Calls: confint -> confint.glm -> profile -> profile.glm
In addition: Warning message:
In qf(1 - alpha, 1, n - p) : NaNs produced
Execution halted
I'm wondering how I can get the effect size confidence intervals? The difference in differences is ~6% but I'm looking for statistically significant intervals.