One of the benefits of DAGs is that they openly state the causal assumptions a researcher is making, allowing for greater transparency. This is nice in theory. However, in practice, the DAGs I have personally created are a tangled web of nodes and arrows. Visually speaking, as the complexity of a DAG increases, I find that its transparency benefit decreases.

With that being said, does anyone have ideas for presenting a DAG (or the information that the DAG generates) in a visually friendly manner? The goal is to include a DAG in a peer-reviewed journal that, by and large, does not publish overtly causal research, so a highly complex DAG is going to be a hard sell.

Just wondering how I can retain the visual transparency of a DAG without bombarding the reader with a highly complex figure. There is no prior example in my field for presenting DAGs in a journal article (beyond educational material where a researcher uses a very simple DAG as a teaching element), so I do not have precedent to guide me.

1 Answer


How to simplify the visual presentation of a DAG

Note of caution: You can only reasonably simplify the presentation if some parts of the DAG can be grouped together, or if not all variables are (equally) important. If things can very easily be grouped, you may want to check if you are using the right level of representation. If not everything is (equally) important, check if you really need/want to use all variables.

Once you have decided on a set of variables, here are some strategies to simplify the visual presentation of a DAG connecting them.

1. Multi-dimensional variables. Say you have 10 variables $(X_1, \ldots, X_{10})$. Perhaps the first 5 describe one concept (e.g., health indicators), and the other 5 another (e.g., education measures). If the role of variables within the groups is similar enough, you could present them as two high-level variables $H$ (health) and $E$ (education).

2. Group by DAG position. Another way to reduce variables is to group by their position in the DAG.

For example, if you have multiple confounders, you can simplify by using a placeholder that indicates that there are multiple variables with the same function (this is frequently used for unobservable variables, of which we do not know how many there may be).

[Figure: Example of visual variable grouping]

[Comment: Some people (especially in stats) use variable names without a surrounding circle to denote observed variables, whereas variables inside circles are taken to be unobserved. Others place observed variables in circles (as seen above), and distinguish unobserved variables by name, color, etc., so there are definitely a few different notation conventions. It may be worth checking other research works using DAGs in your area, but as long as it's consistent, it should be fine.]

3. Other visual aids. You could visually group variables into a containing shape, use colors, or use different types of arrows or nodes. This could be used, for example, to delineate context variables from model variables.
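The grouping strategies above can be sketched in a few lines of plain Python. This is only an illustration, not a prescribed workflow: the helper names `collapse_nodes` and `to_dot`, the group label `L1-L4`, and the toy edge list (four confounders of $A$ and $Y$) are all assumptions made up for the example. The sketch merges each group into one labelled node and writes the result as Graphviz DOT text, shading the grouped node so readers can see that it stands for several variables.

```python
def collapse_nodes(edges, groups):
    """Merge each group of variables into a single labelled node.

    edges  -- iterable of (cause, effect) pairs
    groups -- dict mapping a group label to the set of variables it replaces
    """
    label_of = {var: label for label, members in groups.items() for var in members}
    collapsed = set()
    for cause, effect in edges:
        src = label_of.get(cause, cause)
        dst = label_of.get(effect, effect)
        if src != dst:  # drop arrows that fall entirely inside one group
            collapsed.add((src, dst))
    return sorted(collapsed)


def to_dot(edges, grouped_labels=()):
    """Render an edge list as Graphviz DOT text, shading grouped nodes."""
    lines = ["digraph G {", "  rankdir=LR;"]
    for label in grouped_labels:
        lines.append(f'  "{label}" [style=filled, fillcolor=lightgrey];')
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example: L1..L4 each cause both A and Y, and A causes Y.
edges = [(f"L{i}", target) for i in range(1, 5) for target in ("A", "Y")]
edges.append(("A", "Y"))
groups = {"L1-L4": {"L1", "L2", "L3", "L4"}}

small = collapse_nodes(edges, groups)
print(small)
print(to_dot(small, grouped_labels=groups))
```

The nine original arrows reduce to three. Note that the collapsed graph is only a valid DAG if the grouping does not introduce a cycle (for instance, grouping $X_1$ and $X_2$ when $X_1 \to Z \to X_2$ would), so it is worth checking the result before drawing it.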

Non-visual presentation

If your DAG is really big (e.g., in the 100s of nodes), it might make more sense to provide it in a machine-readable format like an edge list or adjacency matrix.
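As a rough sketch of what such a machine-readable supplement could look like, the snippet below turns an edge list into a CSV string and an adjacency matrix using only the Python standard library. The function names, the `cause`/`effect` column headers, and the three-node toy graph are assumptions chosen for illustration; a journal may prefer a different layout.

```python
import csv
import io


def edge_list_csv(edges):
    """Return the edges as CSV text with one 'cause,effect' row per arrow."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["cause", "effect"])
    writer.writerows(edges)
    return buf.getvalue()


def adjacency_matrix(edges):
    """Return (nodes, matrix) where matrix[i][j] == 1 means nodes[i] -> nodes[j]."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for cause, effect in edges:
        matrix[index[cause]][index[effect]] = 1
    return nodes, matrix


# Hypothetical toy DAG: L confounds the effect of A on Y.
edges = [("L", "A"), ("L", "Y"), ("A", "Y")]

print(edge_list_csv(edges))
nodes, matrix = adjacency_matrix(edges)
print(nodes)
for row in matrix:
    print(row)
```

An edge list stays compact for sparse graphs, while an adjacency matrix grows with the square of the number of nodes, so for a DAG with hundreds of nodes the edge list is usually the easier file to share.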

Scriddie
    +1 Brian Lookabaugh, note that "grouping variables together" does not mean "omitting the variables" even if the variables are implicated in backdoor path confounding, etc. All the grouping does is simplify communication of the structure of specific causal relationships. For example, if $L_1, L_2, L_3,$ and $L_4$ all confound a relationship between $A$ and $Y$ because they are all individually direct causes of both $A$ and $Y$, you need not draw all 8 arrows from the $L$s to $A$ and $Y$, but can label a single node as "$L_1, L_2, L_3,$ and $L_4$" with one arrow to $A$ and one arrow to $Y$. – Alexis Mar 17 '23 at 17:55