I'm having trouble interpreting the results after a propensity score matching (PSM) procedure. I used full matching with a logit link and no caliper, and I'm testing the effect of my binary treatment "buyout" on the binary outcome "dest_flood". What I want to report is a z-test for the adjusted sample using the matching (inverse-probability-style) weights. Following the MatchIt vignette, I prepared my code this way.
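For reference, the matching step that produced the `weights` and `subclass` columns in `full_data` looked roughly like this. My actual covariates aren't shown, so this sketch uses MatchIt's built-in `lalonde` data and its `treat` variable as a stand-in for `buyout_flag`:

```r
library(MatchIt)

# Full matching on a logit propensity score, no caliper
# (lalonde's covariates stand in for my own, which aren't shown here)
m.out <- matchit(treat ~ age + educ + race + married,
                 data = lalonde, method = "full",
                 distance = "glm", link = "logit")

# match.data() returns the data with the matching weights
# and subclass IDs attached, which the models below use
matched <- match.data(m.out)
head(matched[, c("treat", "weights", "subclass")])
```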
library(lmtest)    # coeftest()
library(sandwich)  # vcovCL()
modelA <- glm(dest_flood ~ buyout_flag, data = full_data, weights = weights,
              family = quasibinomial(link = "logit"))
round(coeftest(modelA, vcov. = vcovCL, cluster = ~subclass)[1:2, ], digits = 3)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.214 0.285 -7.75 0.000
buyout_flag -0.926 0.437 -2.12 0.034
sandA <- coeftest(modelA, vcov. = vcovCL, cluster = ~subclass)
sandA
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.214 0.285 -7.75 8.9e-15 ***
buyout_flag -0.926 0.437 -2.12 0.034 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
exp(coef(modelA)) #OR
(Intercept) buyout_flag
0.109 0.396
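In case it helps, here is how I'd turn the cluster-robust output into an odds ratio with a 95% Wald confidence interval. This sketch plugs in the numbers printed above; in practice I'd pull the coefficient and SE out of `sandA` directly:

```r
# Cluster-robust estimate and SE for buyout_flag, from the coeftest output above
b  <- -0.926   # log-odds coefficient
se <-  0.437   # cluster-robust standard error

or <- exp(b)                          # odds ratio, matches exp(coef(modelA))
ci <- exp(b + c(-1, 1) * 1.96 * se)   # 95% Wald CI on the odds-ratio scale

round(c(OR = or, lower = ci[1], upper = ci[2]), 3)
```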
The step where I print `sandA` (the output headed "z test of coefficients") seems like it should be the result to report. But when I compare it to a prop.test on the same outcome, the results differ enough that I'm not sure how to interpret them:
table(full_data$dest_flood, full_data$buyout_flag)
0 1
0 852 254
1 80 11
prop.test(x = c(11, 80), n = c(265, 932), p = NULL,
          alternative = "two.sided", correct = TRUE)
2-sample test for equality of proportions with continuity correction
data: c(11, 80) out of c(265, 932)
X-squared = 5.1579, df = 1, p-value = 0.02314
alternative hypothesis: two.sided
95 percent confidence interval:
-0.07675366 -0.01190129
sample estimates:
prop 1 prop 2
0.04150943 0.08583691
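My current guess at the discrepancy is that prop.test uses the raw, unweighted counts, while coeftest is run on the matching-weighted sample, so the two aren't estimating the same quantity. A comparison on the weighted scale would look roughly like this (a sketch, with `full_data` and its `weights` column as above):

```r
# Weighted outcome proportions by treatment group, using the matching weights.
# Unlike the raw prop.test counts, these should line up with the weighted glm.
p1 <- with(subset(full_data, buyout_flag == 1),
           weighted.mean(dest_flood, w = weights))
p0 <- with(subset(full_data, buyout_flag == 0),
           weighted.mean(dest_flood, w = weights))
c(treated = p1, control = p0)
```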
By comparison, when I use the same process (more or less) for a continuous outcome, the weighted post-PSM coeftest matches up fairly well with a basic t-test:
model1 <- lm(percent_poverty_dest ~ buyout_flag, data = full_data, weights = weights)
round(coeftest(model1, vcov. = vcovCL, cluster = ~subclass)[1:2,], digits = 3)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 12.65 0.876 14.444 0.000
buyout_flag -0.79 1.126 -0.702 0.483
sand1 <- coeftest(model1, vcov. = vcovCL, cluster = ~subclass)
sand1
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 12.649 0.876 14.4 < 2e-16 ***
buyout_flag -0.790 1.126 -0.7 0.48
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
buyout <- subset(full_data, buyout_flag == 1)
prox <- subset(full_data, buyout_flag == 0)
t.test(buyout$percent_poverty_dest, prox$percent_poverty_dest)
Welch Two Sample t-test
data: buyout$percent_poverty_dest and prox$percent_poverty_dest
t = -3, df = 456, p-value = 0.01
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.016 -0.392
sample estimates:
mean of x mean of y
11.9 13.6
How should I interpret this? And can I get z-test results for binary outcomes after a PSM match?
link = "log" in the glm() call. For model1, I don't know what the outcome is so I can't recommend a proper interpretation, but -.79 does represent a difference in means. – Noah Jun 23 '22 at 14:19