1

I need to apply a corr.test on all my 4 variables based on a sliding window of 4 and step 1. I could do it for 2 variables at a time, is it possible to do it on 4 variables?

library("gtools")
library("psych")
pos <- subs$POS
result <- running(subs[,2], subs[,3], fun=corr.test,  width=4, by=1)
n <-length(pos)
intervals <- paste(pos[1:(n-3)],pos[4:n], sep =':')
names(result) <- intervals

subs <- structure(list(POS = c(92, 92, 94, 94, 101, 103, 105, 106, 107, 
113, 114, 120, 120, 122, 132, 136, 140, 143, 143, 146), var1 = c(1, 
6, 1, 1, 3, 1, 1, 2, 10, 1, 1, 5, 1, 1, 5, 8, 2, 1, 1, 17), var2 = c(11355.6578947368, 
11355.6578947368, 11347.8157894737, 11347.8157894737, 11450.0789473684, 
11478, 11473.1052631579, 11479.7105263158, 11495.1315789474, 
11642.6052631579, 11691.2631578947, 11612.0526315789, 11612.0526315789, 
11497.7368421053, 11580.1578947368, 11421.6842105263, 11288.1052631579, 
11288.0263157895, 11288.0263157895, 11278), var3 = c(1.37171511940581, 
1.37171511940581, 1.38015533583543, 1.38015533583543, 1.38285457974173, 
1.38467221575929, 1.38728237171707, 1.38800712625898, 1.38790768043085, 
1.39036602407455, 1.39189385417378, 1.39598161236479, 1.39598161236479, 
1.39737128761462, 1.38302756354476, 1.37244652743204, 1.3632470398133, 
1.35191834216266, 1.35191834216266, 1.33779404050067), var4 = c(1.04285684958464, 
1.04285684958464, 1.04296690407379, 1.04296690407379, 1.04244689713763, 
1.04242150761792, 1.04233095775203, 1.0422836372936, 1.04225281778793, 
1.04210777961787, 1.04202037841291, 1.04165492323573, 1.04165492323573, 
1.04161860236611, 1.04065255428577, 1.03999715968399, 1.03927165790059, 
1.03890373057753, 1.03890373057753, 1.03861585865751)), row.names = c(NA, 
20L), class = "data.frame")

If I try this:

result <- running(subs[,2:5], fun=corr.test, 
                   width=4, by=1)

I get this error:

Error in seq.default(from = 1, to = length(run.elements), by = by) : 
  wrong sign in 'by' argument
user3224522
  • 1,018
  • 8
  • 18

2 Answers2

2

Define a function f that outputs a plain vector of whatever statistics that you want, e.g. p.adj, and then use rollapplyr with by.column = FALSE. (If the output of f were to have length 1 then replace rownames with names.)

library(zoo)

f <- function(x) corr.test(x)$p.adj # change this to whatever you want
r <- rollapplyr(subs[-1], 4, f, by.column = FALSE)
rownames(r) <- intervals  # intervals defined in question
r

giving this matrix whose rows are intervals and whose columns are statistics:

             [,1]        [,2]       [,3]       [,4]       [,5]       [,6]
92:94   1.00000000 1.000000000 0.00000000 1.00000000 0.00000000 0.00000000
92:101  1.00000000 1.000000000 1.00000000 1.00000000 0.05655636 1.00000000
94:103  1.00000000 1.000000000 0.07544652 1.00000000 0.05425068 0.18885878
94:105  1.00000000 1.000000000 0.56802781 1.00000000 0.09806659 0.56802781
101:106 1.00000000 1.000000000 1.00000000 1.00000000 1.00000000 0.14343590
103:107 0.16535559 1.000000000 1.00000000 1.00000000 1.00000000 0.21105126
105:113 1.00000000 1.000000000 0.09584584 1.00000000 0.11065442 0.09584584
106:114 1.00000000 1.000000000 0.05139145 1.00000000 0.03790992 0.04056634
107:120 0.18617738 1.000000000 1.00000000 1.00000000 1.00000000 0.04339496
113:120 1.00000000 1.000000000 1.00000000 1.00000000 1.00000000 0.01435381
114:122 1.00000000 1.000000000 0.52583100 1.00000000 0.82197327 0.10782051
120:132 1.00000000 1.000000000 1.00000000 1.00000000 1.00000000 0.05212361
120:136 1.00000000 0.005975605 1.00000000 0.00346566 1.00000000 0.01157140
122:140 1.00000000 1.000000000 1.00000000 1.00000000 1.00000000 0.00542238
132:143 0.80881021 0.808810209 0.28263145 0.80881021 0.13299017 0.07881523
136:143 0.05864769 0.179665391 0.17966539 0.10236170 0.17437441 0.10236170
140:146 0.01366799 0.549918470 0.54991847 0.54991847 0.54991847 0.06677469
G. Grothendieck
  • 233,926
  • 16
  • 195
  • 321
  • thanks a lot! where can I add method="pearson", use = "pairwise.complete.obs", adjust ="BH" ? – user3224522 Apr 15 '21 at 11:57
  • why di we have 6 columns in the output? If I print r values I get 4 values for each var... – user3224522 Apr 15 '21 at 12:07
  • I did read the answer, it wasnt clear why do we have 6 values for p.adj and what each column stands for. – user3224522 Apr 15 '21 at 12:15
  • 1
    Presumably they are the adjusted p values for each of the 6 pairs of columns. I am assuming that is the most likely set of statistics that you are after but you are free to define f in any way you like to get any statistics you want. In ?corr.test it says they correspond to the upper triangular part of an nxn matrix (here n = 4). See the description of the p output in ?corr.test. – G. Grothendieck Apr 15 '21 at 12:41
1
library(corrplot)
cor_subs <- cor(subs)
cor_subs

Output:

            POS       var1       var2       var3       var4
POS   1.0000000  0.3256234 -0.1769368 -0.5323549 -0.9662176
var1  0.3256234  1.0000000 -0.1890354 -0.4210894 -0.3612658
var2 -0.1769368 -0.1890354  1.0000000  0.8202677  0.4028399
var3 -0.5323549 -0.4210894  0.8202677  1.0000000  0.7257573
var4 -0.9662176 -0.3612658  0.4028399  0.7257573  1.0000000

With significance level:

rcorr(cor_subs, type = c("pearson","spearman"))

Output:

       POS  var1  var2  var3  var4
POS   1.00  0.70 -0.66 -0.86 -0.99
var1  0.70  1.00 -0.81 -0.88 -0.76
var2 -0.66 -0.81  1.00  0.94  0.75
var3 -0.86 -0.88  0.94  1.00  0.92
var4 -0.99 -0.76  0.75  0.92  1.00

n= 5 


P
     POS    var1   var2   var3   var4  
POS         0.1844 0.2242 0.0585 0.0010
var1 0.1844        0.1001 0.0475 0.1333
var2 0.2242 0.1001        0.0184 0.1420
var3 0.0585 0.0475 0.0184        0.0255
var4 0.0010 0.1333 0.1420 0.0255  

Adding a plot

corrplot(cor_subs, method="number", type="lower")

enter image description here

TarJae
  • 43,365
  • 4
  • 14
  • 40
  • 1
    Great solution. I think this is the one she/he was looking for. And also great visualization by the way. – Anoushiravan R Apr 15 '21 at 10:37
  • 1
    thanks a lot for this solution. But again the same problem, it does not take in consideration a sliding window based on the position, which is clearly shown in my code above. – user3224522 Apr 15 '21 at 10:38
  • In your first not edited edition of your question that was not so clear for me. Sorry for that. – TarJae Apr 15 '21 at 10:42
  • 1
    sorry, my fault, i put it in the code but did not spicfy in the question. – user3224522 Apr 15 '21 at 10:46