From reading the MatchIt vignette and a similar question here, I understand that including covariates in the outcome model after matching is optional. However, I'm unsure which covariates should go in the outcome model when the matching also used exact matching: the covariates from the matching formula, or the exact matching variables.

Say I am performing the following matching where I specify exact matching on two variables:

matchExample <- matchit(
  treatment ~ matchVar1 + matchVar2 + matchVar3,
  data = data,
  method = "full",
  exact = ~ indexClass + indexRegion,
  distance = "glm",
  link = "logit",
  estimand = "ATT"
)

matchData <- match.data(matchExample)

What is the correct way to specify the matching variables and the exact matching variables in the outcome model when estimating effects?

For example, the "Moderation Analysis" section of the MatchIt vignette "Estimating Effects After Matching" suggests including only the exact matching variables in the lm() and avg_comparisons() calls, like this:

# Fit a model for the outcome given the treatment,
# following the moderation analysis suggestion
mod1 <- lm(response ~ treatment * (indexClass + indexRegion),
           data = matchData,
           weights = weights)

# Calculate average comparisons
avgComp1 <- avg_comparisons(mod1,
                            variables = "treatment",
                            vcov = ~ subclass,
                            newdata = subset(matchData, treatment == 1),
                            wts = "weights",
                            by = c("indexClass", "indexRegion"))

However, in this post, the matching variables, not the exact matching variables, are used as covariates in the lm() call. Here is the equivalent in the context of my example:

# Fit a model for the outcome given the treatment,
# following the other post's suggestion
mod2 <- lm(response ~ treatment * (matchVar1 + matchVar2 + matchVar3),
           data = matchData,
           weights = weights)

# Calculate average comparisons
avgComp2 <- avg_comparisons(mod2,
                            variables = "treatment",
                            vcov = ~ subclass,
                            newdata = subset(matchData, treatment == 1),
                            wts = "weights",
                            by = c("indexClass", "indexRegion"))

When I use the "by =" argument to request treatment effects stratified by the exact matching variables, should the exact matching variables be included as covariates, as in mod1, or should the matching variables be included as covariates, as in mod2?
