7

I'm investigating the effect of a continuous variable A on a measurement variable M stratified by another factor variable C in an observational dataset.

Due to heteroscedasticity, I decided to use a bootstrapped regression analysis. However, looking at the data, the background variables are not evenly distributed if I dichotomise A (present or not). I've just finished running a second analysis in which I repeat the same regression after matching the dataset on the confounders (using CEM in R).

Now the problem is which analysis to trust: the bootstrapped regression on the entire dataset, or the bootstrapped regression on the matched data? For one of the levels of C, the results diverge.

Any ideas how this can be analyzed?

Rob Hyndman
  • 56,782
Misha
  • 1,323

1 Answer

6

There is a wrinkle that you need to worry about. With matching, you throw away observations (i.e., those that aren't matched and so don't make it into your analysis), and some might be replicated. These decisions aren't random; they are a function of the covariates. As a result, constructing confidence intervals in this context is a bit complicated. To calculate appropriate standard errors, see "Large Sample Properties of Matching Estimators for Average Treatment Effects" by Abadie and Imbens. Abadie also has a paper, "On the Failure of the Bootstrap for Matching Estimators". To implement Abadie-Imbens standard errors, see the Matching package in R by Jas Sekhon.
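To make the first point concrete, here is a minimal sketch in plain Python. It assumes one-to-one nearest-neighbour matching with replacement on a single covariate, which is not the CEM procedure from the question, and the covariate values and the helper `match_with_replacement` are made up for illustration. It only shows that which controls get dropped or reused is determined by the covariates rather than by chance.

```python
from collections import Counter


def match_with_replacement(treated, controls):
    """Return, for each treated value, the index of the nearest control.

    Controls can be picked more than once (matching with replacement).
    """
    matches = []
    for x_t in treated:
        nearest = min(range(len(controls)), key=lambda i: abs(controls[i] - x_t))
        matches.append(nearest)
    return matches


# Hypothetical covariate values (say, age); not data from the question.
treated_x = [30, 28, 32, 50]
control_x = [29, 33, 49, 70, 75]

matches = match_with_replacement(treated_x, control_x)
use_counts = Counter(matches)

# Controls that no treated unit picked never enter the matched analysis.
dropped = [i for i in range(len(control_x)) if i not in use_counts]
# Controls picked by several treated units are replicated in the matched data.
reused = [i for i, n in use_counts.items() if n > 1]

print("matched control index per treated unit:", matches)
print("controls never used (dropped):", dropped)
print("controls used more than once:", reused)
```

The two oldest controls are dropped and the youngest one is used twice, purely because of where the treated units sit on the covariate. Resampling the matched rows as if they were independent draws ignores that dependence.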

On the question of which estimator to believe: that depends on how well you think the matching controls for the confounders. I'd be inclined to believe that approach. It seems that your first set of analyses doesn't control for those factors in any way? If you think they are important, then you probably wouldn't be inclined to believe those results.

Charlie
  • 14,062
  • I'll look into the Matching package. My first analysis does control for the same factors as the matching procedure, in a multivariate linear regression model. Thanks for the feedback. – Misha Sep 26 '10 at 11:50