Isn't strong ignorability an incorrect assumption in complex causal structures?

Question

I have seen that in many papers/competitions for causal inference, the assumption of strong ignorability is made -

$P(Y^{x}\perp X\mid V)$, where $X$ is the treatment, $Y$ the outcome and $V$ indicates the set of all covariates (all other variables).

Example -

This assumption is called the assumption of "no unobserved confounders". But, does it not also make an assumption of the following type of structure (consider $V$ to be just one variable for below figure) -

What if the causal structure is as below (consider $V = \{L,Z\}$)? -

In this case, given $\{L,Z\}$ - $X$ and $Y$ are not independent. Note, it would be wrong to say such structures do not appear. In data science we can easily face complex causal structures, where conditioning on all variables would open up one or more biasing path(s).

Given this context, the following are my queries -

In the presence of such complex structures, does the condition $P(Y^x \perp X\mid V)$ not fail?
If the strong ignorability condition really fails in the face of such collider containing structures, why is it so rampant in competitions/research-papers? Particularly, I have seen this in social-science, healthcare etc. Is it because such structures are not expected to arise in those fields, because only known confounders are included in the dataset?
The papers/competitions given above propose ideas for good estimators, or benchmarks for testing those. For someone who expects to face such complex structures, should they ignore the results of these resources that make such assumptions? Or should they just replace "all covariates" with "all confounders"? After all, if the causal structure is clearly known, one can always find the respective confounders for the effect of $X$ on $Y$. And then use the best estimators the papers propose/benchmark.

score 8 · Accepted Answer · edited Aug 27 '23 at 18:29

The assumption of strong ignorability is that there exists a set of variables $W$, possibly a subset of all measured variables $V$, such that $Y^X \perp X \mid W$. It does not say that $Y^X \perp X \mid V$, i.e., that the potential outcomes are independent of treatment given all measured variables $V$. To meet this assumption, one has to find the set $W$. In your example, $W$ consists only of $L$ and does not include $Z$. So, strong ignorability is not met when using $W = \{L, Z\}$ but it is met when using $W = \{L\}$. We call $W$ a "sufficient adjustment set". Note that strong ignorability says nothing about how to construct or find $W$ or which sets of variables are allowable to satisfy the assumption. It is merely the assumption that there is a set of such variables. Estimators that rely on strong ignorability (and not all estimators of causal effects do) then use $W$ to estimate the effect.

The theory used to identify which variables form a sufficient adjustment set, i.e., which set of variables to include in $W$ to meet strong ignorability, is DAG theory. DAG theory says not to include $Z$ in $W$ precisely because doing so opens up a backdoor path; conditioning on a collider induces a non-causal association between the antecedents of the collider.

In an ideal observational study, a researcher should identify a set of variables to collect that form a sufficient adjustment set. In practice, researchers often have a dataset with many variables already measured, and they have to decide which ones among them belong in $W$, which do not belong in $W$ (e.g., colliders), and whether there is a set of variables that forms a sufficient adjustment set. My sense is that most studies do this to some extent, which is why you might not have to worry about studies being invalidated by conditioning on colliders. One way to avoid conditioning on colliders is to only condition on variables measured before treatment (i.e., "pre-treatment covariates"); these cannot be caused by either the treatment or the outcome and so are less likely to induce a violation of strong ignorability (though pre-treatment colliders of unmeasured pre-treatment causes of the treatment and outcome can still be a problem; this is called "M-bias").

Thanks for the answer Noah. On a side note, I mentioned in point 3 that many of the benchmarking efforts are towards comparing causal effect estimators. I have posted a related query asking about any such efforts towards causal discovery in the face of partial knowledge of DAG being available. In fact, my search started for such causal discovery related benchmarking efforts, but I found causal estimator evaluation frameworks instead. Hence this strong ignorability assumption statements sounded a bit confusing to me. — Anirban Chakraborty, Aug 28 '23 at 16:28

score 5 · Answer 2 · answered Aug 27 '23 at 15:52

I guess the answer to 1. is yes. As for 2., at least in social sciences, the idea behind this condition is that conditional on covariates -- i.e. observable pre-treatment characteristics -- the treatment allocation is as good as random. So, the key part is pre-treatment (those typically include gender, age, etc., something that has been realized before the treatment), hence treatment cannot affect those characteristics (only the opposite direction is possible). So, this implicitly rules out the collider structure you described. Hope this clarifies the answer to 2.

Post-treatment ``covariates'' are typically known in the literature as mediators. You can look for papers about mediation analysis online, but generally sufficient conditions for the identification of treatment effects that account for mediators are stronger (e.g., a so-called sequential ignorability assumption) see Imai et al. (2010) for details.

Isn't strong ignorability an incorrect assumption in complex causal structures?

2 Answers2

Linked