I am working on 4 different plants, and I have RNA-Seq data from sequencing them. I look for two events, E1 and E2, at certain positions in their genomes. Say E2 is the common event and E1 is a special one. The positions I look at are identical across all 4 plants. Suppose the counts I observe at one particular position are:
      P1   P2   P3   P4
E1     0   20    0   17
E2   100   80  100  120
Here, P1 through P4 refer to the plants, and E1 and E2 refer to the events. E2 is more common. My objective is to check whether E1 occurs more often in one or more plants than in the others. If it occurs at the same proportion in all of them, it is of course not interesting to me.
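To make the setup concrete, here is a minimal R sketch of the table above together with the per-plant proportion of E1. The object names (counts, prop_e1) are made up for illustration; only the counts themselves come from the observation above.

```r
# Counts copied from the example position above.
# Rows are events (E1, E2); columns are plants (P1..P4).
# `counts` and `prop_e1` are illustrative names, not part of any pipeline.
counts <- matrix(
  c(0, 100,    # P1: E1, E2
    20, 80,    # P2: E1, E2
    0, 100,    # P3: E1, E2
    17, 120),  # P4: E1, E2
  nrow = 2,
  dimnames = list(event = c("E1", "E2"),
                  plant = c("P1", "P2", "P3", "P4"))
)

# Proportion of E1 among all events (E1 + E2) within each plant.
prop_e1 <- counts["E1", ] / colSums(counts)
print(round(prop_e1, 3))
```

These per-plant proportions are the quantities that the null hypothesis of "no difference between plants" is about.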
I have 2 questions (I have already asked question 1 before but didn't get an answer):
Is Fisher's exact test appropriate for this problem?
My null hypothesis is that the proportion of E1 does not differ between the 4 plants. I then test whether the proportions I observe here are significantly different. I use R's fisher.test() and get a p-value of 2.5e-10, so I reject my null hypothesis for this case because I find strong evidence against it.
Sometimes, I have an observation like this:
      P1   P2   P3   P4
E1     0   20    0   17
E2     0   80    0  120
For this table, fisher.test() gives me a p-value of 0.147, so I don't reject the null hypothesis. However, from a biological point of view, I would consider this significant. Still, the Fisher test does answer the question I originally asked. I guess the 0/0 proportions for P1 and P3 are not useful (or not used). So, my question is: is it possible to modify the test so that it is sensitive even when both E1 and E2 are 0 in one or more plants for a particular observation?
Having thought a bit while framing this post, I guess that in that case I have to ask a different question.
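For reference, the two fisher.test() runs described above can be reproduced roughly as follows. This is only a sketch: the object names (tab_a, tab_b, tab_b_nz) are illustrative, and the p-values in the comments are the ones quoted in the post, not independently checked output. The last step drops the plants with no counts at all, to check the guess that the 0/0 columns do not contribute to the test. If that guess is right, the two p-values for the second table should agree.

```r
# Both tables from the post; rows are events, columns are plants.
# All object names here are illustrative.
plants <- c("P1", "P2", "P3", "P4")
tab_a <- rbind(E1 = c(0, 20, 0, 17), E2 = c(100, 80, 100, 120))
tab_b <- rbind(E1 = c(0, 20, 0, 17), E2 = c(0, 80, 0, 120))
colnames(tab_a) <- colnames(tab_b) <- plants

# Fisher's exact test on the full 2 x 4 tables.
p_a <- fisher.test(tab_a)$p.value  # the post reports 2.5e-10 here
p_b <- fisher.test(tab_b)$p.value  # the post reports 0.147 here

# Keep only plants that have at least one count (P2 and P4 in tab_b).
tab_b_nz <- tab_b[, colSums(tab_b) > 0, drop = FALSE]
p_b_nz <- fisher.test(tab_b_nz)$p.value

print(c(first_table = p_a, second_table = p_b, second_nonzero_only = p_b_nz))
```

If the last two values match, the second table is effectively a P2 versus P4 comparison, and P1 and P3 add no information about the proportion of E1.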