Does estimating continuous treatment effects make sense in this case?

Question

I was reading the tutorial for twangContinuous here.

I loaded their dataset dat and found that there are two treatment groups, A and B, each observed at different levels of tss_0, their continuous exposure, which represents a count of trauma symptoms. Both treatment groups span 0<=tss_0<=13.

     0  0.1  0.2 0.88    1 1.16    2 2.31 2.33    3 3.45    4    5    6    7    8    9   10   11   12   13
A 1238    0    3    1   49    1   81    0    1   69    0   77   82   79   65   66   56   59   26   34   13
B 1307    1    0    0   33    0   68    2    0   82    1   81  101   91   64   53   43   39   16   10    8

If I want to define the treatment group as mothers with at least 12 years of schooling, does it make sense to estimate a continuous treatment effect using this package? I have already estimated all of the teffects estimators (RA, IPW, IPWRA, AIPW, PSM, NNM) in Stata, where the treatment variable is binary, so I defined treatment=1 if the mother's education is >=12 years and treatment=0 otherwise. At first, I thought I could keep the same treatment definition in the continuous treatment effect estimator but consider mother's education at specific levels: 12 years, 13 years, 14 years, 15 years, and so on. It's like comparing 'no drug' to different doses of the drug, or in this case, no high school graduation to various levels of mother's education, starting at 12 years.

However, upon examining the dataset associated with the twangContinuous package, I'm questioning whether my case is suitable for a continuous treatment effect estimator. Could someone please confirm if estimating continuous treatment effects makes sense for me or not? Any other related suggestions would be greatly appreciated.

I haven't yet. I will do that. thanks for the suggestion, dimitriy ! — SFCha, Oct 18 '23 at 17:33

Noah · Answer 1 · 2023-10-20T15:13:39.657

In twangContinuous, the variable named treat is (maybe confusingly) a confounder, not the treatment variable, so you should not focus on it. It has nothing to do with the analysis except that it is a confounder that needs to be balanced by the weights with respect to the actual treatment variable, tss_0.

If you have a continuous treatment, no matter what your final model or comparison is, you need to estimate weights that ensure the treatment is independent of the covariates. GBM as implemented in twangContinuous is one way to estimate the weights, but not the only or even the best way. Many options (including GBM) are available in WeightIt, so I'll demonstrate how to run an analysis using WeightIt.

First you need to estimate the balancing weights. We'll let A be the continuous treatment, x1, x2, and x3 be the confounders, and Y be the outcome. We'll use energy balancing which tends to work very well.

W <- WeightIt::weightit(A ~ x1 + x2 + x3, data = data, method = "energy")
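To try this call without real data, one could first build a simulated dataset like the one below. Everything in it is hypothetical: the sample size, the coefficients, and the choice to let A range over 8 to 20 (mimicking years of schooling) are assumptions made only for illustration.

```r
# Hypothetical simulated data; all coefficients are arbitrary choices
set.seed(123)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rbinom(n, 1, 0.5)

# Continuous treatment that depends on the confounders, clipped to [8, 20]
A <- pmin(pmax(round(13 + 1.5 * x1 - x2 + x3 + rnorm(n, sd = 2)), 8), 20)

# Outcome depends nonlinearly on A and also on the confounders
Y <- 2 + 0.5 * A - 0.02 * (A - 14)^2 + x1 + 0.5 * x2 - x3 + rnorm(n)

data <- data.frame(A, x1, x2, x3, Y)
head(data)
```

Because A depends on x1, x2, and x3 here, the unweighted data are confounded by construction, which gives the weights something to fix.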

Next you can assess balance using cobalt::bal.tab():

cobalt::bal.tab(W, un = TRUE)
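For a continuous treatment, balance is summarized by the weighted correlation between the treatment and each covariate, which should be close to zero after weighting. Below is a hand-rolled sketch of that idea, together with the Kish effective sample size. The helper names wcor and ess are hypothetical, the data are simulated, and the weights are a uniform placeholder standing in for W$weights; the exact standardization cobalt applies may differ.

```r
# Weighted Pearson correlation between treatment a and covariate x
wcor <- function(a, x, w) {
  w  <- w / sum(w)
  ma <- sum(w * a)
  mx <- sum(w * x)
  sum(w * (a - ma) * (x - mx)) /
    sqrt(sum(w * (a - ma)^2) * sum(w * (x - mx)^2))
}

# Kish effective sample size implied by a set of weights
ess <- function(w) sum(w)^2 / sum(w^2)

set.seed(1)
n  <- 200
x1 <- rnorm(n)
a  <- 12 + x1 + rnorm(n)
w  <- rep(1, n)  # placeholder; in practice use the estimated weights

wcor(a, x1, w)   # with uniform weights this equals cor(a, x1)
ess(w)           # with uniform weights this equals n
```

A weighted correlation near zero alongside a reasonably large effective sample size suggests the weights balance the covariate without relying on a handful of observations.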

If balance is achieved, you can move forward with modeling the outcome. Otherwise, try a different weighting method until you have achieved balance. You can model the outcome however you want, but here we'll use a restricted cubic spline (a natural cubic spline, fit with splines::ns()).

# Bring the estimated weights into the dataset
data$weights <- W$weights
# Fit the weighted outcome model with a natural cubic spline in A
fit <- lm(Y ~ splines::ns(A, df = 4),
          data = data, weights = weights)

You can also add the covariates into the model for precision. Finally, we can estimate and plot the dose-response curve using functions in marginaleffects:

library("marginaleffects")
# Treatment values at which to evaluate the dose-response curve
values <- seq(8, 20)
p <- avg_predictions(fit,
                     variables = list(A = values),
                     vcov = "HC3",
                     wts = "weights")
library("ggplot2")
ggplot(p, aes(x = A)) +
  geom_line(aes(y = estimate)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high),
              alpha = .3) +
  labs(x = "A", y = "E[Y|A]") +
  theme_bw()
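Roughly, each point on this curve is an average of model predictions after setting A to a given value for everyone in the sample. The base-R sketch below illustrates that averaging step on simulated data. The data, the placeholder weights w, and the inclusion of x1 in the outcome model are all assumptions for illustration. It produces no standard errors, so it is not a substitute for avg_predictions().

```r
set.seed(42)
n   <- 300
x1  <- rnorm(n)
A   <- pmin(pmax(round(13 + x1 + rnorm(n, sd = 2)), 8), 20)
Y   <- 1 + 0.4 * A + x1 + rnorm(n)
dat <- data.frame(A, x1, Y)
dat$w <- rep(1, n)  # placeholder for the estimated balancing weights

fit <- lm(Y ~ splines::ns(A, df = 4) + x1, data = dat, weights = w)

values <- seq(8, 20)
dose_response <- sapply(values, function(a) {
  nd <- dat
  nd$A <- a  # set everyone's treatment to the same value
  weighted.mean(predict(fit, newdata = nd), nd$w)
})

data.frame(A = values, estimate = dose_response)
```

When the outcome model contains only the treatment, every unit gets the same prediction at a given value of A, so the averaging matters only once covariates enter the model.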

You can compare the average outcomes for less than 12 vs greater than or equal to 12 by creating a new variable and examining the average values of the outcome between those two groups.

# Binary indicator: 0 if the treatment A is below 12, 1 otherwise
data$treat_12 <- ifelse(data$A < 12, 0, 1)
avg_predictions(fit,
                by = "treat_12",
                vcov = "HC3",
                wts = "weights",
                hypothesis = "pairwise")

Finally, we can use a similar method to test the difference between having less than 12 years of education and each individual value of the treatment greater than or equal to 12.

# Reference level 0 for A below 12; otherwise keep the observed value of A
data$treat_12b <- ifelse(data$A < 12, 0, data$A)
avg_predictions(fit,
                by = "treat_12b",
                vcov = "HC3",
                wts = "weights",
                hypothesis = "reference")

So, from this analysis, you get the dose-response curve and a test of whether having a treatment value less than 12 yields a different average outcome from having a treatment value of 12 or greater. We did all this while respecting the original scale of the treatment, creating auxiliary variables to supply to avg_predictions() to get the quantities we want.
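The grouping behind these comparisons can also be sketched by hand: take the fitted values, average them within each level of the auxiliary variable using the weights, and difference the averages. The example below uses simulated data and placeholder weights (both assumptions) and gives point estimates only, with no standard errors or tests.

```r
set.seed(7)
n   <- 300
A   <- sample(8:20, n, replace = TRUE)
Y   <- 1 + 0.4 * A + rnorm(n)
dat <- data.frame(A, Y)
dat$w <- rep(1, n)  # placeholder for the estimated balancing weights

fit <- lm(Y ~ splines::ns(A, df = 4), data = dat, weights = w)
dat$pred <- predict(fit)

# Auxiliary grouping: 0 if A is below 12, 1 otherwise
dat$treat_12 <- ifelse(dat$A < 12, 0, 1)

# Weighted average of the fitted values within each group
group_means <- sapply(split(dat, dat$treat_12),
                      function(d) weighted.mean(d$pred, d$w))
group_means

# Difference between the two group averages
unname(group_means["1"] - group_means["0"])
```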

Now I will warn you that your question is poorly defined. What if there is a huge difference between people with 8 years of education and those with 11 years of education? Does it really make sense to lump all those people together in a single group? Maybe for a simple comparison, but the dose-response curve is the only output that fully captures the relationship you are studying.

I greatly appreciate your thorough explanation. I'm going to give this a shot. While it's not my primary identification strategy, a reviewer has requested some treatment effect estimators like PSM. I'm just thinking of checking out the continuous treatment effect as well. Thanks again! — SFCha, Oct 20 '23 at 13:43
