Causality between two binary time series

Question

I have the following sample of a big data frame:

Time (ms)  Signal_1  Signal_2
0          0         0
1          0         0
2          0         1
3          0         0
4          1         0
5          0         0
6          0         0
.          .         .
.          .         .
.          .         .
996        1         1
997        0         0
998        0         0

Signal_1 indicates whether a heartbeat occurred in person X at time i.

Signal_2 indicates whether a heartbeat occurred in person Y at time i.

Time (ms) is the time i and the index of the data frame. Time = 0 marks the beginning of the experiment. Time = 1000 marks one second after the beginning of the experiment.

Since the signals are nominal (boolean), how can I use VAR and Granger causality to test whether Signal_1 causes Signal_2?

Is there any way to calculate the correlation between these binary time series?

I posted an answer to a similar question here: https://stats.stackexchange.com/a/511416/135759. Let me know if it helps you! — Maximilian Aigner, Feb 26 '21 at 19:51
You are interested in correlation or causality? Correlation doesn't imply causation. — Tim, Feb 04 '22 at 13:43

score 0 · Answer 1 · edited Aug 25 '23 at 15:44

Correlation is not well suited to binary data, but there are many similarity indices you can use instead, such as the Jaccard index.
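As a minimal sketch of the Jaccard index for two binary signals: it is the number of time points where both signals are 1, divided by the number of time points where at least one is 1. The function name `jaccard_index` and the toy arrays are hypothetical, chosen for illustration only; they are not the data from the question.

```python
import numpy as np


def jaccard_index(a, b):
    """Jaccard index |A and B| / |A or B| for two equal-length binary sequences."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("both signals must have the same length")
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Neither signal ever fires, so the index is undefined.
        return float("nan")
    return float(np.logical_and(a, b).sum() / union)


# Toy data (illustrative assumption, not the real recording).
signal_1 = [0, 0, 0, 0, 1, 0, 0, 1]
signal_2 = [0, 0, 1, 0, 0, 0, 0, 1]
print(jaccard_index(signal_1, signal_2))
```

Note that this only counts beats that fall in exactly the same time bin, so with 1 ms bins it may be worth checking whether coarser bins suit the data better.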

edited Aug 25 '23 at 15:44 by Galen (8,442) · answered Nov 20 '19 at 22:08 by Giulia Martini (354)

Provided that neither variable is degenerate, Pearson's correlation is defined on binary variables. The numerator is the independence gap, and the denominator normalizes the scale according to the Cauchy-Schwarz inequality. It isn't inherently problematic to compute correlations on binary data, but it does require some attention to the mathematics. – Galen Aug 25 '23 at 15:43

score 0 · Answer 2 · answered Aug 25 '23 at 15:33

Causal Models

Supposing causal sufficiency, the Markov property, and faithfulness, there are some simple options to get started with. You can extend these models by allowing causes that jump across multiple time points, but I have not shown those here. You could also suppose that each variable is not a cause of its own next value, but I find that unlikely in practice (go ahead and explore that option if it is plausible in this case).

You could use do-calculus to design an experiment to further investigate which of these models appears to be correct.

Correlation

You could compute a correlation score. I think computing the covariance is simpler. For binary variables it is the independence gap, $P(S_1 = 1, S_2 = 1) - P(S_1 = 1)P(S_2 = 1)$, and it is bounded: $\text{Cov}[S_1, S_2] \in \left[ -\frac{1}{4}, \frac{1}{4}\right]$.
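A minimal sketch of both quantities, assuming the signals are 0/1 arrays of equal length. The covariance is computed as the gap between the joint frequency of simultaneous beats and the product of the marginal frequencies, and Pearson's correlation divides that gap by $\sqrt{p_1(1-p_1)\,p_2(1-p_2)}$. The function names and the toy arrays are hypothetical and only for illustration.

```python
import numpy as np


def independence_gap(a, b):
    """Population covariance of two binary signals: P(a=1, b=1) - P(a=1) * P(b=1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean(a * b) - a.mean() * b.mean())


def binary_correlation(a, b):
    """Pearson correlation of two binary signals; NaN if either one is constant."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    var_a = a.mean() * (1.0 - a.mean())  # variance of a 0/1 variable
    var_b = b.mean() * (1.0 - b.mean())
    if var_a == 0.0 or var_b == 0.0:
        # A degenerate (constant) signal has no defined correlation.
        return float("nan")
    return independence_gap(a, b) / float(np.sqrt(var_a * var_b))


# Toy data (illustrative assumption, not the real recording).
s1 = [0, 0, 0, 0, 1, 0, 0, 1]
s2 = [0, 0, 1, 0, 0, 0, 0, 1]

gap = independence_gap(s1, s2)
rho = binary_correlation(s1, s2)

# Cross-check against NumPy's own estimators (bias=True gives the population covariance).
print(gap, np.cov(s1, s2, bias=True)[0, 1])
print(rho, np.corrcoef(s1, s2)[0, 1])
```

The covariance reaches its upper bound of 1/4 only when the two signals are identical and fire half of the time, for example `independence_gap([0, 1], [0, 1])`.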
