Can I use a permutation test on timeseries data?

Question

What I am trying to do:

I am currently doing analysis on neuronal calcium imaging data.

In particular, I have two things:

A time series that represents the amount of calcium within a neuron
A boolean time series that encodes whether an activity from a mouse is taking place

I want to see if a specific neuron is activated when the defined action is taking place.

The method I want to use:

One technique I have read about in various papers consists of building a linear classifier on the calcium time series, converting it into a boolean array (1 if it is above the threshold, 0 if it is below). This boolean array derived from calcium is then compared to the boolean array that encodes the activity of interest by computing a confusion matrix. This is done to see whether an elevation in calcium concentration encodes the activity.

In particular, we span all the possible thresholds (from a minimum to a maximum) for the calcium imaging data. From the various confusion matrices we can then build a ROC curve and use its area as a performance metric for that particular neuron.
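As a minimal sketch of the threshold sweep just described (assuming NumPy; the function name `auroc_by_threshold_sweep` and its arguments are illustrative and not taken from the papers), one confusion matrix per threshold gives one point on the ROC curve, and the area follows from the trapezoid rule:

```python
import numpy as np

def auroc_by_threshold_sweep(calcium, behavior):
    """Area under the ROC curve obtained by thresholding `calcium`.

    calcium  : 1-D array of calcium values (hypothetical input name)
    behavior : 1-D boolean array, True while the behavior takes place
    """
    calcium = np.asarray(calcium, dtype=float)
    behavior = np.asarray(behavior, dtype=bool)
    n_pos = behavior.sum()
    n_neg = (~behavior).sum()

    # Sweep thresholds from "above everything" down to the minimum value,
    # so the curve runs from (FPR, TPR) = (0, 0) to (1, 1).
    thresholds = np.concatenate(([np.inf], np.unique(calcium)[::-1]))

    tpr = np.empty(len(thresholds))
    fpr = np.empty(len(thresholds))
    for k, t in enumerate(thresholds):
        predicted = calcium >= t                 # boolean array from calcium
        tp = np.sum(predicted & behavior)        # confusion-matrix entries
        fp = np.sum(predicted & ~behavior)
        tpr[k] = tp / n_pos
        fpr[k] = fp / n_neg

    # Trapezoid rule over the ROC points.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

A value near 1 means calcium is higher during the behavior, a value near 0 means it is lower, and 0.5 means no separation.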

The problem:

In the various papers, the authors then wanted to see whether the results were statistically significant or were obtained by pure chance. They tested the significance by circularly permuting the calcium time series (they select a random index "i" of the timeseries, and inverted the timeseries before "i" with the timeseries after "i"). They claim to do the permutation in this way to better preserve the physiological structure of the timeseries.
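To make the operation concrete, here is a small sketch of such a circular permutation (assuming NumPy; `circular_shift` and `random_circular_shift` are hypothetical helper names, and how the papers draw the index is my assumption):

```python
import numpy as np

def circular_shift(series, i):
    """Move the first `i` samples to the end, keeping the order within each part."""
    series = np.asarray(series)
    return np.roll(series, -i)

def random_circular_shift(series, rng):
    """Circularly shift at a random index in 1..n-1 (assumed sampling scheme)."""
    series = np.asarray(series)
    i = int(rng.integers(1, len(series)))  # upper bound is exclusive
    return circular_shift(series, i), i

# Shifting 1..9 at index 4 gives 5 6 7 8 9 1 2 3 4.
example = circular_shift(np.arange(1, 10), 4)
```

Every pair of neighbouring samples stays adjacent except at the single seam where the end of the series meets its start, which is the discontinuity referred to in this question.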

What I do not understand is why this permutation test is applicable to timeseries data. I read about the exchangeability hypothesis that needs to be satisfied before applying this permutation method, but to me this does not seem to be the case: each calcium sample depends strongly on the sample before it. And even if we restrict the analysis to circular permutations, we still have a discontinuity point in the middle of the permuted series.

Questions:

Is this analysis doable or is the exchangeability hypothesis holding this analysis back?
If this analysis cannot be done, are there alternative ways to test the significance if I do not know the underlying null distribution?

Reference article

https://pubmed.ncbi.nlm.nih.gov/31230711/

See “Analysis of Single Cell Responses During Behavior”

What about a block bootstrap? That seems vaguely similar to what you describe --- by the way, what is the length of the time-series? — kjetil b halvorsen, Dec 16 '22 at 18:03
When you write "inverted the timeseries before "i" with the timeseries after "i"", what precisely do you mean by "inverted"? — jbowman, Dec 16 '22 at 18:10
Because this approach is nearly equivalent to permuting the mouse activity indicator series, it would be interesting to know more about that series. For instance, does it consist of occasional isolated 1's, or could it be characterized as sequences of blocks of zeros, blocks of ones, and so on, or something more complicated? — whuber, Dec 16 '22 at 18:19
Please edit the question to include links to the papers that describe the circular permutation method. Also, note that you might be better off modeling the calcium concentration as a continuous function rather than playing with thresholds, which can get you into trouble when you choose the threshold based on the data (although I do appreciate the idea of trying to mimic a neuron's presumably all-or-none response). Please address the issues raised above by editing the question, as comments are easy to overlook and can be deleted. — EdM, Dec 16 '22 at 18:58
One more thought: this seems mostly to address the issue of what might be considered technical replicates within a neuron or mouse, while what's typically of more interest is the consistency of results among neurons or mice. You might not really need this within-neuron significance estimation at all. If you do need such within-neuron significance estimates, please edit the question to explain why. — EdM, Dec 16 '22 at 19:01
I can give an example to explain a bit better what inverting means. If I have a timeseries “123456789” and I make a circular permutation at index 4, then the permuted timeseries will be “567891234” — Luca, Dec 17 '22 at 16:07
The activity of the mouse is made up of blocks of 0 and blocks of 1 as the mouse cannot instantaneously change what it is doing at a particular moment. So I do not have isolated ones — Luca, Dec 17 '22 at 16:08
If the data are long enough, consider cyclically permuting the mouse series, but allowing breaks only where the value changes. This will preserve much of the correlation structure. BTW, your example of "inverting" remains obscure. How does "inversion" differ from a circular permutation, if at all? — whuber, Dec 17 '22 at 17:02
The inversion is the cyclical permutation. I used just another term to avoid repeating myself. Sorry for the confusion — Luca, Dec 17 '22 at 17:48

EdM · Accepted Answer · 2022-12-17T22:02:01.183

The individual-neuron calcium signals described in the Kingsbury et al. paper were evaluated "during behavior events" (maybe better described as behavior epochs) within mice. This was done by placing two mice to face each other in a tube, and determining periods showing behaviors of "push," "retreat" and "approach" for each mouse over time.

A continuous fluorescent measure of calcium concentration $(\Delta F/F)$ was evaluated across multiple "behavior events" with what seems to have been a standard AUROC (area under the receiver operating characteristic curve) approach. That was evidently done individually for each type of behavior event, over a series of different types of events and periods without the evaluated behaviors.

The circular permutations were done to estimate a random null distribution against which to evaluate the AUROC values within an individual neuron, to identify cells whose activity was associated with a behavior. 1000 circular permutations were done for each neuron.

A neuron was considered significantly responsive (⍺ = 0.05) if its auROC value exceeded the 95th percentile of the random distribution (auROC < 2.5th percentile for suppressed responses, auROC > 97.5th percentile for excited responses).
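A rough sketch of that procedure, under stated assumptions: NumPy only, a rank-based (Mann-Whitney) AUROC in place of an explicit threshold sweep, shifts drawn uniformly from 1 to n-1, and the two-sided 2.5th/97.5th percentile rule from the quoted passage. The function names and the `seed` parameter are mine, not the paper's.

```python
import numpy as np

def auroc(signal, labels):
    """Rank-based AUROC: P(signal during behavior > signal outside), ties count 1/2."""
    signal = np.asarray(signal, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Average 1-based ranks, handling ties.
    _, inverse, counts = np.unique(signal, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))

def circular_shift_test(calcium, behavior, n_perm=1000, seed=0):
    """Compare the observed AUROC with a null built from circular shifts."""
    calcium = np.asarray(calcium, dtype=float)
    behavior = np.asarray(behavior, dtype=bool)
    rng = np.random.default_rng(seed)

    observed = auroc(calcium, behavior)

    # Shift the calcium trace against the fixed behavior labels.
    shifts = rng.integers(1, calcium.size, size=n_perm)
    null = np.array([auroc(np.roll(calcium, s), behavior) for s in shifts])

    low, high = np.percentile(null, [2.5, 97.5])
    if observed > high:
        label = "excited"
    elif observed < low:
        label = "suppressed"
    else:
        label = "not significant"
    return observed, (float(low), float(high)), label
```

The null is only useful if the behavior epochs are numerous and irregularly spaced relative to the shifts; with a few long or periodic epochs, many shifts nearly reproduce the original alignment and the test loses power.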

What's important in that application is to randomize the calcium signals among the behavior types and periods without the specified behaviors. I don't see that the (admittedly large) short-term correlations in the calcium signal over time would pose a problem, provided that the signals were adequately randomized among behaviors/lack-of-behavior. If you are considering a similar approach, that larger-scale randomization is what's critical.

Figure 5 shows that this approach was successful in identifying subsets of cells associated with increased or decreased activity for each of the 3 behaviors. After that classification into cell types based on individual AUROC values, further evaluation was based on continuous measures. For example:

For comparison of response characteristics across subject and opponent cells (Figures S8B and S8C), the response strength for each neuron and each behavior was calculated as the average z-scored $\Delta F/F$ activity during all behavior epochs of a given type. Response probability for each neuron and each behavior was calculated as the percentage of behavior events with average neural activity that exceeded 110% of the local baseline (increased by more than 10% above baseline), taken over the 10 s preceding behavior onset.

Those evaluations further supported the AUROC-based classifications of individual neurons into behavior-associated groups.

Addendum on exchangeability

The lack of exchangeability typical of time-series data doesn't invalidate this method, because of the random variables that need to be considered "exchangeable" here. From Wikipedia, an exchangeable sequence of random variables is one

whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered.

In this application, what's important is the exchangeability of the observations among behavioral epochs over which evaluations are made: the periods of "push," "retreat," "approach," and "none" identified from mouse behavior. Within any of those individual epochs, the calcium measurements are certainly not exchangeable in time. What's needed for this type of study, however, is that the random variables estimated from the epochs (e.g., the mean $\Delta F/F$ values within individual epochs) are exchangeable. That's over a much different time scale.

For the type of exchangeability needed here, the joint distribution of $\Delta F/F$ values among epochs in a sequence of "approach, none, push" shouldn't be different than it would be if the order were instead "push, approach, none," for example. The structure within each epoch doesn't matter per se. Insofar as that's the case, then the circular permutation to serve as a null for $\Delta F/F$ values within epochs is valid. The circular permutation further maintains the short-term characteristics of the calcium time series even as the mapping between that time series and the behaviors is permuted, removing a potential problem with complete randomization of calcium observations.

Kingsbury et al., Correlated Neural Activity and Encoding of Behavior across Brains of Socially Interacting Animals, Cell 178: 429–446 (2019).

Thank you for your answer! I have understood what they did however my question was about why they did permutation tests if they were working with timeseries and not exchangeable data. See https://stats.stackexchange.com/questions/473076/permutation-tests-and-exchangeability/473108#473108 — Luca, Dec 17 '22 at 19:43
@Luca "values near each other in time are statistically related" (emphasis added) when you have time series, so on those time scales the values are non-exchangeable. But the time periods (behavior epochs) over which you are randomizing (via permutation) in this situation are far apart, much farther apart than the autocorrelations within each neuron's calcium signals. You will need to evaluate that matter for any new application of this approach. — EdM, Dec 17 '22 at 19:48
@Luca I added to the answer a more direct treatment of the exchangeability that's important to address in this type of study. — EdM, Dec 17 '22 at 21:59
Okay perfect, thank you a lot! Just one more small question. I am trying to see if the method of the cyclical permutations described has any connection with the method described in this other paper. If it has any connection to the Test 1 described there it may be helpful for me to better understand this topic. Do you think that they are connected? I think that the timeseries may encode or not a behavior and we are extracting that dependence with a linear classifier based on thresholds. Is it correct? https://www.jmlr.org/papers/volume11/ojala10a/ojala10a.pdf — Luca, Dec 18 '22 at 11:55
@Luca "Test 1" in the paper linked from your comment is based on permuting the class labels of a data set while keeping all the other case data intact. The circular permutation approach is different in detail although similar in intent. Shifting the calcium signals in time means there is no longer any necessary association between a class label for an epoch after randomization and the calcium data underlying any single original epoch before randomization. So with the cyclic permutation used here, the "other case data" aren't really kept intact. — EdM, Dec 18 '22 at 19:28
@Luca don't get too hung up on the terminology of "extracting that dependence with a linear classifier based on thresholds." That's just one of many ways to express the process of constructing an ROC curve from a continuous predictor. Any order-preserving transformation of a continuous predictor will provide the same AUROC. Behavior-specific neurons were identified as having particularly high or low AUROC values, based on continuous $\Delta F/F$ in this case, for classifying a behavior. — EdM, Dec 18 '22 at 19:43
