I'm struggling with something that at first sight seemed simple to me but turned out not to be. I have a big data frame (112 samples x 58 characteristics). The characteristics are all binary (presence/absence, yes/no), coded as 1/0. Here's a random example:
# 20 samples, alternating between condition 0 and condition 1
condition <- rep(0:1, 10)
# 10 random binary characteristics, one value per sample
char1 <- sample(0:1, 20, replace = TRUE)
char2 <- sample(0:1, 20, replace = TRUE)
char3 <- sample(0:1, 20, replace = TRUE)
char4 <- sample(0:1, 20, replace = TRUE)
char5 <- sample(0:1, 20, replace = TRUE)
char6 <- sample(0:1, 20, replace = TRUE)
char7 <- sample(0:1, 20, replace = TRUE)
char8 <- sample(0:1, 20, replace = TRUE)
char9 <- sample(0:1, 20, replace = TRUE)
char10 <- sample(0:1, 20, replace = TRUE)
df <- data.frame(condition, char1, char2, char3, char4, char5, char6, char7, char8, char9, char10)
I would like to see whether there is some sort of correlation between the condition (0 or 1) and any of the analysed characteristics.
I've thought of generating a proportion table and running a chi-squared test on it:
prop<-prop.table(data.matrix(df),2)
chi<-chisq.test(prop)
but I'm getting X-squared = NaN (with the real dataset, the example here gives an actual output). So I'm thinking this is not the right approach. I've also tried a logistic regression but the model doesn't converge and it also seems somewhat odd.
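To make the goal concrete, here is a minimal sketch of one possible setup: each characteristic is cross-tabulated against condition separately and tested with Fisher's exact test, and the resulting p-values are then adjusted for multiple testing. This is an assumption about what could fit the question, not a validated analysis of the real dataset. The seed, the toy data, and the names `chars`, `p_raw`, `p_adj`, and `result` are made up for illustration.

```r
set.seed(42)  # assumption: fixed seed so the toy data are reproducible

# Toy data in the same shape as the example: 20 samples, 10 binary characteristics
n <- 20
condition <- rep(0:1, n / 2)
chars <- as.data.frame(replicate(10, sample(0:1, n, replace = TRUE)))
names(chars) <- paste0("char", 1:10)

# One Fisher's exact test per characteristic against condition.
# factor(..., levels = 0:1) keeps the table 2x2 even if a column is constant.
p_raw <- sapply(chars, function(x) {
  tab <- table(factor(condition, levels = 0:1), factor(x, levels = 0:1))
  fisher.test(tab)$p.value
})

# Adjust for testing many characteristics (Benjamini-Hochberg FDR)
p_adj <- p.adjust(p_raw, method = "BH")

# Collect the results, smallest raw p-value first
result <- data.frame(characteristic = names(p_raw), p_raw, p_adj, row.names = NULL)
result[order(result$p_raw), ]
```

Testing on the raw 0/1 counts rather than on a proportion table keeps the sample size in the test, which a table of proportions discards. Since the toy data are random, no characteristic is expected to stand out here.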
Does anyone have any suggestions on how to solve this? I'll be happy with just a method name.
Many many thanks!
Thanks for the reply and the clarification of these two concepts. I would like to test for correlation.
– Sushiroll Jul 25 '23 at 11:37