
I am currently developing a cluster-extent permutation test for time-series data. As a sanity check, I want to see whether my test is biased when it is run on null data (i.e., data with no effect between conditions). My approach is as follows:

True observed procedure

  1. I have 2 groups.
  2. Each group has n subjects and each subject has 25 datapoints.
  3. I run a t-test at each of the 25 datapoints, comparing the two groups (25 t-tests in total).
  4. I get 25 p-values.
  5. If a p-value is smaller than 0.05, I mark it as significant.
  6. Significant p-values can form clusters.
  7. I define cluster size as the number of adjacent datapoints with significant (first-level) p-values (cluster size = 0 for non-significant ps, cluster size = 1 if a single p is significant and its neighbours are not). A code sketch of this first-level procedure follows this list.
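
For concreteness, a minimal Python sketch of the first-level procedure; the array shapes, variable names, and helper structure are illustrative assumptions, not my exact implementation (`group_a` and `group_b` are assumed to be `(n_subjects, 25)` NumPy arrays):

```python
import numpy as np
from scipy.stats import ttest_ind

ALPHA = 0.05  # first-level significance threshold

def first_level_pvalues(group_a, group_b):
    """One independent-samples t-test per datapoint; returns 25 p-values."""
    _, p = ttest_ind(group_a, group_b, axis=0)
    return p

def cluster_sizes(p_values, alpha=ALPHA):
    """Sizes of runs of adjacent significant p-values, one entry per cluster.

    An empty list means no datapoint was significant (cluster size 0).
    """
    sig = p_values < alpha
    sizes, run = [], 0
    for s in sig:
        if s:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes
```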

Permutation procedure

  1. I shuffle group labels randomly, preserving the original sample sizes.
  2. I run the (true observed) procedure described above on the shuffled data.
  3. I compute cluster sizes as described above, extract the largest cluster size, and store it in a distribution.
  4. I repeat this process 10,000 times.
  5. This yields a distribution of maximal cluster sizes under the null (see the sketch after this list).
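
The permutation loop, sketched under the same assumptions and reusing the helpers from the sketch above:

```python
def max_cluster_null(group_a, group_b, n_perm=10_000, alpha=ALPHA, seed=0):
    """Distribution of maximal cluster sizes under random label shuffling."""
    rng = np.random.default_rng(seed)
    data = np.vstack([group_a, group_b])  # pool all subjects
    n_a = group_a.shape[0]
    max_sizes = np.empty(n_perm, dtype=int)
    for k in range(n_perm):
        perm = rng.permutation(data.shape[0])  # shuffle group labels
        p = first_level_pvalues(data[perm[:n_a]], data[perm[n_a:]])
        max_sizes[k] = max(cluster_sizes(p, alpha), default=0)
    return max_sizes
```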

Finally

I assign each true observed cluster (true observed procedure, step 7) a p-value equal to the proportion of maximal cluster sizes in the null distribution (permutation procedure, step 5) that are equal to or larger than the observed cluster size.
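
Using the names from the sketches above, this assignment is:

```python
def cluster_p(observed_size, null_max_sizes):
    """Proportion of null maximal cluster sizes >= the observed cluster size."""
    return np.mean(null_max_sizes >= observed_size)
```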

My sanity test

I shuffle group labels before anything else, i.e., before running the first-level t-tests (true observed procedure, step 3).

I then run everything as described above, and repeat the whole process for 200 random shuffles of my "true observed" data.

I look at the "true observed" clusters' p-values (let's call these "cluster ps under the null"; these are NOT the maximal cluster sizes from the permutation loop!).

I observe these cluster ps under the null to be smaller than 1 about 5% of the time. This is good and makes sense, since under the null the first-level t-tests should come out significant only about 5% of the time. A sketch of this sanity loop follows below.
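
A sketch of the sanity loop, reusing the helpers above; recording p = 1 for a shuffle that produces no cluster is my assumption about how such runs are counted:

```python
def sanity_check(group_a, group_b, n_outer=200, alpha=ALPHA, seed=1):
    """Cluster p-values obtained after destroying any true effect."""
    rng = np.random.default_rng(seed)
    data = np.vstack([group_a, group_b])
    n_a = group_a.shape[0]
    cluster_ps = []
    for _ in range(n_outer):
        perm = rng.permutation(data.shape[0])  # break any true effect first
        a, b = data[perm[:n_a]], data[perm[n_a:]]
        sizes = cluster_sizes(first_level_pvalues(a, b), alpha)
        # inner permutation loop runs on the already-shuffled "observed" data
        null_max = max_cluster_null(a, b, alpha=alpha,
                                    seed=int(rng.integers(2**31)))
        if sizes:
            cluster_ps.extend(cluster_p(s, null_max) for s in sizes)
        else:
            cluster_ps.append(1.0)  # assumed convention: no cluster -> p = 1
    return np.array(cluster_ps)
```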

My Question

How can I assess whether cluster sizes are flagged as significant at the appropriate rate? My intuition tells me that it should be (ps referring to the "cluster ps under the null"):

(number of ps <= 0.05) / (number of ps < 1) = "what should be 0.05 with a valid test"

That is, out of all clusters that were at least of size 1 (i.e., p < 1), what is the proportion of those clusters that were assigned a significant p-value (i.e., p <= 0.05)? The sketch below computes this.
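
Given the array returned by the `sanity_check` sketch above, this check would be computed as:

```python
cluster_ps = sanity_check(group_a, group_b)
formed = cluster_ps[cluster_ps < 1]         # shuffles where a cluster of size >= 1 formed
prop_significant = np.mean(formed <= 0.05)  # should this be ~0.05 for a valid test?
```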

Is my reasoning correct?

Thank you for your time and effort!

Mah1510
  • I don't understand. How are you doing a t-test on single datapoints? If you have two groups and 25 observations, you could do one test. And dividing the results into significant and not significant is arbitrary. And how are you going to cluster p-values? – Peter Flom Jan 30 '24 at 14:11
  • Sorry if things were unclear! And thanks for your reply and effort.

    I have two groups with n participants per group. Each subject has 25 datapoints (I see that this was unclear in my post). I run a separate t-test at each datapoint, comparing the groups.

    I get cluster p-values by computing the proportion of maximal cluster sizes under the null that are greater than or equal to the true observed cluster size.

    Let's say there were 3 consecutive significant t-tests, so the cluster size is 3. The proportion of the 10,000 maximal clusters under the null that are equal to or larger than 3 gives the p-value of my cluster.

    – Mah1510 Jan 31 '24 at 08:13

0 Answers