
Background

Seeking a clear and authoritative explanation of a key concept of Rubin's Potential Outcomes Framework that is causing this hapless OP enormous grief. While the necessity to distinguish between "Observed Outcome(s)" and "Potential Outcome(s)" is clear (to me), the definitions and descriptions of Potential Outcomes are far, far from clear. To be clear: This question is not about counterfactuals.

Here is an excerpt from one of Rubin's papers (source: Wikipedia):

Rubin Quote

As reflected in the quote above, Rubin (et al) use various forms of the trope "...what would have happened if {TREATMENT | CONTROL}". A frequently used example is something to the effect of: If the Subject $i$ took an aspirin $E(x)$ at time $t_1$ the headache would have gone away $(Y=1)$ by time $t_2$. However, it is possible for Subject $i$ to take an aspirin $E(x)$ at time $t_1$ yet the headache does not go away $(Y=0)$ by time $t_2$. It is also possible that no treatment $C(x)$ at time $t_1$ could have the outcome $Y=1$ (headache went away) at $t_2$.

As written, it seems like Potential Outcomes ignore the possibility that the treatment (e.g., $E(x)$) might not produce the desired/expected outcome (i.e., the headache persists, $Y=0$), or that "No Treatment" $C(x)$ DOES achieve the desired/expected outcome $Y=1$. However, I do not find these seemingly obvious "Potential Outcomes" reflected in the literature (e.g., definitions, descriptions, examples).

Key Question

Assuming for a moment a strictly binary outcome ($Y=1$ or $Y=0$), does Rubin et al's concept of "Potential Outcome" include or exclude both "Treatment $E(x)$ might work" $(Y=1)$ AND "Treatment might not work" $(Y=0)$ as well as "Control $C(x)$ might/might not work"?

  • Many thanks to ben, noah, and marjolein for the replies! Because of your individual & collective insights, the outline of an answer is looming in the fog, but... just out of sight. Below is a Dropbox link to an image of a decision tree I crafted in an attempt to illustrate my confusion (RE: Potential Outcomes). If {any | all} of you have a moment to evaluate the diagram and elucidate my error, I will be most appreciative; THANK YOU! https://www.dropbox.com/scl/fi/fytcjnin5ghpz09fbm5ei/Causality-Potential-Outcomes-Model-v20231002.png?rlkey=bs8dsspwsebaiv0kr3nhrce9g&dl=0 – Plane Wryter Oct 04 '23 at 14:24
  • Defining the variables as random variables and writing down the relationships as a structural equation model could be a good idea to answer the question. After all, a causal model is a statistical model with a causal interpretation. For your question the causal interpretation does not seem central - it would be enough to look at the statistical model to see that any of the combinations you ask about can indeed occur. – Scriddie Oct 05 '23 at 13:00

3 Answers

2

Assuming for a moment a strictly binary outcome (Y=1 or Y=0), does Rubin's (et al) concept of "Potential Outcome" include or exclude both "Treatment E(x) MIGHT Work" (Y=1) AND "Treatment might NOT work" (Y=0) as well as "Control C(x) MIGHT/MIGHT NOT Work"?

It includes all of these possibilities. From the same page:

"$Y_{t}(u)$ is Joe's blood pressure if he takes the new pill. In general, this notation expresses the potential outcome which results from a treatment, $t$, on a unit, $u$."

In principle, $Y_{t}(u)$ can take any possible value that outcome $Y$ can take.

"Similarly, $Y_{c}(u)$ is the effect of a different treatment, $c$ or control, on a unit, $u$. In this case, $Y_{c}(u)$ is Joe's blood pressure if he doesn't take the pill."

In principle, $Y_{c}(u)$ can take any possible value that outcome $Y$ can take.

"Thus, this $Y_{t}(u)-Y_{c}(u)$ is the causal effect of taking the new drug."

Of course, we are faced with the not-so-easy-task of estimating $Y_{t}(u)$ and $Y_{c}(u)$. For both, a probability distribution can be estimated, which generally will (or at least can) assign non-zero probabilities to all possible values that outcome $Y$ can take.
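To make the "includes all of these possibilities" point concrete, here is a minimal sketch that enumerates every pair of potential outcomes a single unit could have when the outcome is binary. The function name and dictionary keys are illustrative choices, not notation from Rubin or from the quoted page.

```python
from itertools import product


def enumerate_potential_outcomes():
    """List every (Y_t, Y_c) pair for a binary outcome, with the unit-level effect."""
    rows = []
    for y_t, y_c in product((0, 1), repeat=2):
        # The unit-level causal effect is Y_t(u) - Y_c(u).
        rows.append({"Y_t": y_t, "Y_c": y_c, "effect": y_t - y_c})
    return rows


rows = enumerate_potential_outcomes()
for row in rows:
    print(row)
```

All four combinations appear, including "treatment yields $Y=0$" and "control yields $Y=1$"; nothing in the framework rules any of them out for a given unit.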

1

The potential outcomes and the causal effect are random variables

Because the outcome variable at issue can be treated as a random variable, it can cover all manner of cases where a treatment might or might not work (to some specified probabilistic degree) and where the absence of a treatment might or might not work (to some specified probabilistic degree). To see this, let's continue the Aspirin/headache example using binary variables. Let $T$ denote the binary treatment variable for whether the user takes an Aspirin ($T=1$) or not ($T=0$), and let $Y$ denote the binary outcome at a later time for whether the user's headache has gone away ($Y=1$) or not ($Y=0$), matching the coding in the question. The effect of the treatment variable on the outcome is fully specified by the conditional probability distribution for $Y$ given $T$, which can be encapsulated by the conditional CDFs:

$$\begin{align} F_0(y) &= \mathbb{P}(Y \leqslant y | T=0), \\[6pt] F_1(y) &= \mathbb{P}(Y \leqslant y | T=1). \\[6pt] \end{align}$$

Under the Rubin potential outcome model, we consider these cases through separate random variables called the "potential outcomes", which are $Y_0 \sim F_0$ and $Y_1 \sim F_1$. The first represents the potential outcome that will accrue if the user does not take Aspirin and the second represents the potential outcome that will accrue if the user takes Aspirin. The causal effect of Aspirin on the headache is the random variable $E \equiv Y_1 - Y_0$. In this simple case, since $Y_0$ and $Y_1$ are binary, the causal effect has three possible values. Treating $Y_0$ and $Y_1$ as independent (which the product form below assumes), their respective probabilities are:

$$\begin{align} \mathbb{P}(E = -1) &= \mathbb{P}(Y_0 = 1, Y_1 = 0) \\[6pt] &= \mathbb{P}(Y=1|T=0) \cdot \mathbb{P}(Y=0|T=1), \\[6pt] \mathbb{P}(E = 0) &= \mathbb{P}(Y_0 = 0, Y_1 = 0) + \mathbb{P}(Y_0 = 1, Y_1 = 1) \\[6pt] &= \mathbb{P}(Y=0|T=0) \cdot \mathbb{P}(Y=0|T=1) + \mathbb{P}(Y=1|T=0) \cdot \mathbb{P}(Y=1|T=1), \\[6pt] \mathbb{P}(E = 1) &= \mathbb{P}(Y_0 = 0, Y_1 = 1) \\[6pt] &= \mathbb{P}(Y=0|T=0) \cdot \mathbb{P}(Y=1|T=1). \\[6pt] \end{align}$$

The causal effect $E=1$ means that the Aspirin works, i.e., it causes the headache to go away. The causal effect $E=0$ means that the Aspirin doesn't work, i.e., taking the Aspirin makes no difference to the headache. The causal effect $E=-1$ means that the Aspirin causes the headache to last longer, i.e., the headache would have gone away with no Aspirin but it stayed because of the Aspirin.
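As a rough numerical check of the three probabilities above, the sketch below computes the distribution of $E = Y_1 - Y_0$ from the two conditional probabilities. It assumes, as the product formulas do, that $Y_0$ and $Y_1$ are independent. The function name and the input values 0.3 and 0.8 are hypothetical and chosen only for illustration.

```python
def effect_distribution(p_y1_given_t0, p_y1_given_t1):
    """Distribution of E = Y_1 - Y_0, assuming Y_0 and Y_1 are independent.

    p_y1_given_t0 is P(Y=1 | T=0) and p_y1_given_t1 is P(Y=1 | T=1).
    """
    p0, p1 = p_y1_given_t0, p_y1_given_t1
    return {
        -1: p0 * (1 - p1),                  # Y_0 = 1 and Y_1 = 0
        0: (1 - p0) * (1 - p1) + p0 * p1,   # Y_0 = Y_1
        1: (1 - p0) * p1,                   # Y_0 = 0 and Y_1 = 1
    }


# Hypothetical inputs: P(Y=1 | T=0) = 0.3 and P(Y=1 | T=1) = 0.8.
dist = effect_distribution(0.3, 0.8)
for value, prob in dist.items():
    print(value, round(prob, 4))
```

The three probabilities always sum to one, and all three values of $E$ get positive probability whenever both conditional probabilities lie strictly between 0 and 1.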

Ben
0

Potential outcomes are characteristics of an individual. Each unit has a potential outcome for each level of treatment. In this case, the treatment $T$ has two levels: $E$ (treated) and $C$ (control). So, each individual $i$ has two values, $Y^E_i$ and $Y^C_i$, which are the potential outcomes under treatment and control. You don't need to understand what these mean philosophically to use them. You only need to know the following to understand their position in causal inference:

The causal effect $\tau_i$ for unit $i$ is defined as $$ \tau_i = Y^E_i - Y^C_i $$

For units enrolled in a study comparing $E$ and $C$, the observed outcome $Y$ is a function of treatment and the potential outcomes: $$ Y_i = I(T_i=E)Y^E_i + I(T_i=C)Y^C_i $$

where $I(.)$ is the indicator function ($1$ if its argument is true, $0$ otherwise).
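The observed-outcome formula can be sketched in a few lines. The function name and the example values are hypothetical; the point is only that assignment selects which of the two potential outcomes is revealed, while the other stays unobserved.

```python
def observed_outcome(treatment, y_e, y_c):
    """Return Y_i = I(T_i = E) * Y^E_i + I(T_i = C) * Y^C_i."""
    if treatment not in ("E", "C"):
        raise ValueError("treatment must be 'E' or 'C'")
    # Python booleans act as 0/1, so they play the role of the indicator I(.).
    return (treatment == "E") * y_e + (treatment == "C") * y_c


# A hypothetical unit whose headache goes away only under treatment.
y_e, y_c = 1, 0
tau = y_e - y_c
print(observed_outcome("E", y_e, y_c))  # reveals Y^E_i
print(observed_outcome("C", y_e, y_c))  # reveals Y^C_i
```

Note that `tau` requires both potential outcomes, whereas `observed_outcome` only ever returns one of them for a given assignment.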

There is no concept of "treatment might work or control might work". The comparison is between treatment and control at time 2, not between treatment at time 2 vs. time 1 or between control at time 2 vs. control at time 1. So the question is whether the potential outcome (at time 2) under treatment (i.e., the outcome if an individual received $E$) differs from the potential outcome (at time 2) under control (i.e., the outcome if an individual received $C$).

So when we ask "Does treatment (vs. control) work for individual $i$?", we are asking whether $\tau_i \neq 0$. $\tau_i$ can be anything; if $Y^E_i$ is the same as $Y^C_i$, then we say treatment has no effect, and if $Y^E_i$ is different from $Y^C_i$, we say treatment does have an effect. Again, the comparison isn't between the post-treatment period (time 2) and the starting point (time 1); it is between the treatment and control conditions at time 2 for the same individual.

You may be confused by the idea that if someone has a headache at time 1 and then doesn't have a headache at time 2, whatever happened between those caused the headache to go away, suggesting that whatever happened affected the headache and therefore should be part of the definition of the causal effect. But what you are really describing is a separate treatment: the passage of time. So really you have identified two treatment variables: time, and exposure to the medicine (which we have been calling $T$). It is possible that the medicine doesn't do anything, but people naturally recover from headaches, which suggests there is an effect of time. People who would receive the treatment recover, and people who would receive the control recover. It is not the case that the treatment "works" and the control "works"; rather, time works, but treatment (vs. control) doesn't work.
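A toy illustration of the "time works, treatment doesn't" scenario, using a made-up population of five units in which everyone recovers by time 2 regardless of treatment. All values and names here are hypothetical.

```python
# Each hypothetical unit recovers (outcome 1) by time 2 under both E and C.
units = [{"y_e": 1, "y_c": 1} for _ in range(5)]

# Unit-level causal effects tau_i = Y^E_i - Y^C_i.
effects = [u["y_e"] - u["y_c"] for u in units]

# Recovery rates if everyone were treated, or if everyone were untreated.
recovery_if_treated = sum(u["y_e"] for u in units) / len(units)
recovery_if_control = sum(u["y_c"] for u in units) / len(units)

print(effects)
print(recovery_if_treated, recovery_if_control)
```

Every treated person would recover, yet every $\tau_i$ is zero, because the same people would have recovered under control as well.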

I think this is a much more complicated version of this framework and it is far more useful to think just about the difference in the potential outcomes under treatment and control at time 2, forgetting about time 1. That is, the difference between time 2 and time 1 is not relevant here; what is relevant is the difference between treatment and control at time 2. That is how the potential outcomes are defined and are to be understood and that is how we in a nontechnical sense understand causal effects to operate. I think for some people, the two time points make it clear that an event is occurring (i.e., exposure to or non-exposure to treatment), which yields the eventual outcomes, but the concept of a causal effect does not require the "initial" time point; just an individual who either receives or doesn't receive treatment and their outcomes under those conditions.

I also want to point you to my answer here that describes potential outcomes and their relationship with observed outcomes.

Noah