For inverse probability weighting (IPW) in R, the use of survey::svyglm is well established. I want to compare the results of more than 10,000 (!) models fitted with and without IPW weights (i.e. sampling weights, aka 'pweights' in Stata).
From Lumley (2010) we learn that the HC0 standard errors "are almost identical to the design-based standard errors of [survey::svyglm]; the main difference being in the handling of stratification".
However, even when comparing two models without weights and without strata, the results are still only "almost identical":
fo <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species
dsg <- survey::svydesign(ids=~1, weights=~1, data=iris)
summary(survey::svyglm(fo, design=dsg))$coe
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.1712663 0.24279816 8.942680 1.692962e-15
Sepal.Width 0.4958889 0.07861740 6.307623 3.278723e-09
Petal.Length 0.8292439 0.07081753 11.709585 1.141180e-22
Petal.Width -0.3151552 0.16137075 -1.952988 5.276000e-02
Speciesversicolor -0.7235620 0.23345593 -3.099351 2.333076e-03
Speciesvirginica -1.0234978 0.34165977 -2.995664 3.225249e-03
fit <- lm(fo, iris)
cbind(lmtest::coeftest(fit, vcov.=sandwich::vcovHC(fit, type='HC0')))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.1712663 0.24198747 8.972639 1.422017e-15
Sepal.Width 0.4958889 0.07835490 6.328754 2.946009e-09
Petal.Length 0.8292439 0.07058108 11.748813 9.004061e-23
Petal.Width -0.3151552 0.16083194 -1.959531 5.198181e-02
Speciesversicolor -0.7235620 0.23267644 -3.109735 2.257633e-03
Speciesvirginica -1.0234978 0.34051900 -3.005700 3.126834e-03
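The displayed standard errors suggest the remaining gap is a constant finite-sample factor: for a single-stage design with ids=~1, survey's linearization ("with replacement") variance estimator scales the centered score contributions by n/(n-1), while HC0 applies no such factor. A quick numerical sketch to check this, reusing the iris example above (this is my reading of the discrepancy, not documented svyglm behaviour):

```r
# Check numerically whether the svyglm design-based SEs equal the
# HC0 SEs rescaled by sqrt(n/(n-1)), the with-replacement
# linearization factor for a single-stage, unstratified design.
fo  <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species
fit <- lm(fo, iris)
dsg <- survey::svydesign(ids = ~1, weights = ~1, data = iris)

se_svy <- sqrt(diag(vcov(survey::svyglm(fo, design = dsg))))  # design-based SEs
se_hc0 <- sqrt(diag(sandwich::vcovHC(fit, type = "HC0")))     # HC0 SEs
n <- nrow(iris)

all.equal(unname(se_svy), unname(se_hc0) * sqrt(n / (n - 1)))
```

Here n = 150, so the factor is sqrt(150/149) ≈ 1.00335, which matches the roughly 0.3% discrepancies visible in the two tables above.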
In my project, the survey::svyglm approach without IPW-weights now yields 7.6% significant BH-adjusted p-values of the treatment variable, whereas the lm/"HC0" approach yields 6.2%.
Consequently the results differ considerably depending on the approach.
I am considering reporting the survey::svyglm values, since then the only difference between the IPW model and the naïve model is the use of IPW weights.
However, if I had not intended to compare IPW models with naïve models, I would not have used survey::svyglm in the first place and thus would have arrived at different results.
So my question is: How exactly are these "design-based standard errors" calculated when there are neither weights nor strata in the model? Are they also valid, or even preferable, without weights and strata? And how exactly do they differ from the HC0 standard errors?
This blog post explains what svyglm() is doing and describes the sandwich estimator it uses for standard errors: https://www.practicalsignificance.com/posts/survey-regression-what-we-estimate/ This paper by Dr. Lumley (the author of the 'survey' package) provides a much more detailed explanation: https://projecteuclid.org/journals/statistical-science/volume-32/issue-2/Fitting-Regression-Models-to-Survey-Data/10.1214/16-STS605.pdf – bschneidr Mar 06 '23 at 21:56