I have seen that in many papers/competitions on causal inference, the assumption of strong ignorability is made:
$Y^{x} \perp X \mid V$, where $X$ is the treatment, $Y$ is the outcome, and $V$ is the set of all covariates (all other variables). (Note that independence is a relation between variables, so writing it inside $P(\cdot)$ would be a notational error.)
Example -
- Estimating individual treatment effect: generalization bounds and algorithms
- ACIC Competition (Dataset generation section)
- RealCause: Realistic Causal Inference Benchmarking
This assumption is often called the assumption of "no unobserved confounders". But does it not also implicitly assume a causal structure of the following type (consider $V$ to be a single variable in the figure below)?
What if the causal structure is instead as below (consider $V = \{L, Z\}$)?
In this case, given $\{L, Z\}$, $X$ and $Y$ are not independent: conditioning on the collider $L$ opens a biasing path, so the ignorability condition fails. Note that it would be wrong to say such structures do not appear in practice. In data science we can easily face complex causal structures in which conditioning on all covariates opens one or more biasing paths.
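Since the figure is not reproduced here, a minimal simulation may make the point concrete. It assumes the classic "M-bias" graph: unobserved $U_1 \to X$, $U_1 \to L$, $U_2 \to L$, $U_2 \to Y$, with zero true effect of $X$ on $Y$ and the collider $L$ included among the covariates. All variable names and the continuous treatment are illustrative choices, not taken from the papers above.

```python
import numpy as np

# Hypothetical M-bias structure (a sketch, not any paper's benchmark):
#   U1 -> X,  U1 -> L,  U2 -> L,  U2 -> Y,  and X has NO effect on Y.
# U1, U2 are unobserved; L is a collider sitting in the covariate set.
rng = np.random.default_rng(0)
n = 200_000
u1, u2 = rng.normal(size=n), rng.normal(size=n)
l = u1 + u2 + rng.normal(size=n)   # collider between U1 and U2
x = u1 + rng.normal(size=n)        # treatment (continuous, for simplicity)
y = u2 + rng.normal(size=n)        # outcome; true effect of x on y is 0

def ols_coef_on_x(y, *regressors):
    """Coefficient on the first regressor from an OLS fit with intercept."""
    Z = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[1]

print(ols_coef_on_x(y, x))     # ~ 0.0  : unadjusted estimate is unbiased here
print(ols_coef_on_x(y, x, l))  # ~ -0.2 : adjusting for the collider induces bias
```

Here "adjusting for all covariates" (i.e., including $L$) manufactures a spurious effect of $-0.2$, which matches the analytic value $-\operatorname{Cov}(X,L)/(\operatorname{Var}(X)\operatorname{Var}(L) - \operatorname{Cov}(X,L)^2) \cdot \operatorname{Cov}(L,Y) \cdot \operatorname{Var}(X)$ worked out from the covariance matrix of this particular data-generating process.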
Given this context, the following are my queries -
- In the presence of such complex structures, does the condition $Y^x \perp X \mid V$ not fail?
- If strong ignorability really fails in the face of such collider-containing structures, why is it so rampant in competitions and research papers? I have seen this particularly in social science, healthcare, etc. Is it because such structures are not expected to arise in those fields, since only known confounders are included in the dataset?
- The papers/competitions listed above propose ideas for good estimators, or benchmarks for testing them. Should someone who expects to face such complex structures ignore the results of these resources because of the assumption they make? Or should they simply replace "all covariates" with "a sufficient set of confounders"? After all, if the causal structure is fully known, one can always identify a valid adjustment set for the effect of $X$ on $Y$, and then apply the best estimators these papers propose or benchmark.

