Causal discovery for pairwise independent joint dependent variables
The mantra "association does not imply causation" is well known.
However, in general, we can say that "causation implies association" (*). The problem is to show what kinds of associations are implied by a given set of causal assumptions. It seems to me that any causal theory in an uncertain framework tries to answer this question, and that Pearl's theory gives the most reliable answers (related: Criticism of Pearl's theory of causality).
The paper you cited is based on this theory, and all the causal discovery algorithms presented in it rely on the concept of d-separation (or d-connection) or related ones. Indeed, any associational implications of causal assumptions come from this concept.
E.g. if one applies the Fast Causal Inference algorithm (Sec 3.2 of
this paper) to these variables, all edges are immediately removed
because they are all pairwise independent.
What's going on?
d-connection deals with conditional or unconditional dependencies, but in your example no unconditional (marginal) dependencies exist. Moreover, note that even unobserved confounders produce (spurious) dependencies among the observed variables; but this is not your case.
In most cases, all dependencies (and causal effects) stem from marginal ones (direct causal effects), but in your case no marginal dependencies exist. For this reason, all edges disappear and the FCI algorithm fails.
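To see concretely why every edge is removed, here is a minimal sketch of the standard pairwise-independent-but-jointly-dependent construction (my own illustration, not from the cited paper, assuming binary X and Y are fair coins and Z = X XOR Y):

```python
from itertools import product

# Toy construction: X and Y are independent fair coins, Z = X XOR Y.
# Enumerate the exact joint distribution over (X, Y, Z).
joint = {}
for x, y in product([0, 1], repeat=2):
    joint[(x, y, x ^ y)] = joint.get((x, y, x ^ y), 0.0) + 0.25

def marginal(idx):
    """Marginal distribution of the variables at positions idx."""
    m = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in idx)
        m[sub] = m.get(sub, 0.0) + p
    return m

# Every pair is independent: P(a, b) = P(a) * P(b) for all values.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    pij, pi, pj = marginal([i, j]), marginal([i]), marginal([j])
    for (a, b), p in pij.items():
        assert abs(p - pi[(a,)] * pj[(b,)]) < 1e-12

# Yet the three variables are jointly dependent:
print(joint[(0, 0, 0)])  # 0.25, whereas full independence would give 0.5**3 = 0.125
```

Since an FCI-style algorithm starts by deleting the edge between every marginally independent pair, this distribution leaves it with an empty skeleton before any conditioning sets are even tried.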
(*) My previous sentence can be questioned at the logical level, and it seems that some special counterexamples can be constructed. However, it seems to me that, at most, only some but not all of the presumed associations disappear, and the practical relevance of those examples seems poor. Read here for a discussion: Does causation imply correlation?
I realized that your case matches precisely the particular case I had in mind: the example given by Carlos Cinelli here.
https://stats.stackexchange.com/q/301823
This case is quite special, and I suspect that most common causal discovery algorithms cannot help you in such situations. The difference is that you do not have the DAG; you are looking for it, but in your case any of the three variables can be placed in the collider node. Observational data cannot help you.
Let me add something.
I'm trying to represent X,Y,Z in a causal diagram.
Certainly, such a distribution, if represented causally, ... .
Take care: causal discovery algorithms can help you to bridge the gap, but remember that a causal structure, usually represented with DAGs or SEMs, is not a translation of the joint distribution of the observed variables. Treating it as one is a clear violation of the demarcation line suggested by Pearl, and a tremendous mistake (read the previous link and these: Under which assumptions a regression can be interpreted causally? ; How would econometricians answer the objections and recommendations raised by Chen and Pearl (2013)?).
What is true is that, for a given causal structure and a specified joint distribution of the structural errors, we can deduce a joint distribution for the endogenous variables involved. However, in general, more than one distribution of the endogenous variables is compatible with a given causal structure; this can depend on variations in the distribution of the errors.
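As a toy illustration of that last point (my own linear-Gaussian sketch, not something from the question or the paper; the coefficient B and the error scales are arbitrary choices):

```python
import random
import statistics

random.seed(0)
B = 2.0  # structural coefficient in Y = B*X + e (arbitrary choice)

def simulate(err_sd, n=100_000):
    """Draw (X, Y) from the SEM  X = u (std. normal),  Y = B*X + e."""
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [B * x + random.gauss(0.0, err_sd) for x in xs]
    return xs, ys

# Same causal structure X -> Y, two different error distributions,
# hence two different joint distributions for the endogenous variables:
for sd in (0.5, 2.0):
    xs, ys = simulate(sd)
    # Theory: Var(Y) = B**2 * Var(X) + Var(e)
    print(round(statistics.variance(ys), 1), B**2 * 1.0 + sd**2)
```

The DAG X → Y is the same in both runs; only the error variance changed, yet the observable joint distribution of (X, Y) is different.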
More importantly, several causal structures can share the same set of dependencies (via d-connection/d-separation). Indeed, from this fact emerges the concept of an "equivalence class". This proves mathematically, for the first time, the mantra "association does not imply causation".
Indeed, your example is compatible with three DAGs, each with independent exogenous errors. In particular, you can place any of the variables in the collider node. Those DAGs form a Markov equivalence class.
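A quick numeric check of that collider symmetry (again my own XOR sketch, not from the cited paper): whichever of the three variables you condition on, the remaining two become dependent, which is exactly the behaviour each of the three equivalent DAGs predicts for its own collider node.

```python
from itertools import product

# X and Y independent fair coins, Z = X XOR Y (positions 0, 1, 2).
probs = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], repeat=2)}

def dependent_given(i, j, k):
    """True if variables i and j are dependent conditional on variable k."""
    for kv in (0, 1):
        sub = {key: p for key, p in probs.items() if key[k] == kv}
        tot = sum(sub.values())
        pij, pi, pj = {}, {}, {}
        for key, p in sub.items():
            pij[(key[i], key[j])] = pij.get((key[i], key[j]), 0.0) + p / tot
            pi[key[i]] = pi.get(key[i], 0.0) + p / tot
            pj[key[j]] = pj.get(key[j], 0.0) + p / tot
        if any(abs(p - pi[a] * pj[b]) > 1e-12 for (a, b), p in pij.items()):
            return True
    return False

# Conditioning on any single variable opens a dependence between the others:
for i, j, k in [(0, 1, 2), (0, 2, 1), (1, 2, 0)]:
    print(dependent_given(i, j, k))  # True in all three cases
```

Since the data give the same answer for every choice of collider, nothing observational distinguishes the three DAGs.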