I have paired t-test p-values for individual CpG sites in the genome, and I would like to know whether I can combine them using Fisher's method to get one p-value for each unique gene. Linked here is the Wikipedia page for Fisher's method. I understand that the method assumes the individual p-values being combined are independent. I'm relatively new to biostatistics, so I'm wondering whether using CpGs from the same gene would violate this assumption.
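For reference, the statistic in question can be written out as follows. This is the standard textbook form of Fisher's method, where k is the number of p-values being combined (here, the number of CpGs assigned to a gene, which is an assumption about how the sites would be grouped). The chi-square reference distribution holds only if the p_i are independent and the null hypothesis is true for each of them.

```latex
X^2 = -2 \sum_{i=1}^{k} \ln p_i , \qquad X^2 \sim \chi^2_{2k} \ \text{under independence and the null}
```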
Why do you want to combine the CpG values? Do you want to know whether a specific gene/region is more methylated than others? What is your hypothesis (you mention a t-test for the CpG sites)? Are you testing whether they are methylated vs. not, or more methylated than the rest of the genome, or...? – llrs Aug 11 '17 at 10:54
In a nutshell, your intuition is correct: they are not independent. – Konrad Rudolph Sep 12 '17 at 10:05
2 Answers
Methylation levels of neighbouring CpGs are highly correlated, so the independence assumption is violated and Fisher's method would be problematic. Having said that, you have no reason to use Fisher's method after a paired t-test. A paired t-test will give you a single p-value per gene, which is what you want. Do be sure to only include CpGs with some minimal coverage in both groups.
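To illustrate why the correlation matters, here is a minimal simulation sketch, not an analysis of real methylation data. It assumes that each gene has 10 CpGs whose null z-scores share a common component with correlation `rho`; the function names, the value `rho = 0.8`, and the number of simulated genes are all hypothetical choices made for illustration. Every gene is simulated under the null, so the fraction of genes with a combined p-value below 0.05 should be close to 0.05 if Fisher's method is valid.

```python
import math
import random


def fisher_combine(pvalues):
    """Fisher's method; only valid if the p-values are independent."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # X^2 / 2
    # Closed-form chi-square survival function for even df = 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(rho, n_cpgs=10, n_genes=2000, alpha=0.05, seed=0):
    """Fraction of null genes called significant after Fisher combination.

    rho is the (assumed) correlation between CpG z-scores within a gene.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        shared = rng.gauss(0.0, 1.0)  # component common to all CpGs in the gene
        pvals = []
        for _ in range(n_cpgs):
            noise = rng.gauss(0.0, 1.0)
            z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise
            pvals.append(two_sided_p(z))
        if fisher_combine(pvals) < alpha:
            hits += 1
    return hits / n_genes


if __name__ == "__main__":
    print("independent CpGs:", false_positive_rate(rho=0.0))
    print("correlated CpGs: ", false_positive_rate(rho=0.8))
```

With independent CpGs the false positive rate should stay near the nominal 5%, while with strongly correlated CpGs it should come out well above that, which is the sense in which Fisher's method is problematic here.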
Devon Ryan
What is typically done in methylation analysis is to assess CpG islands.
Check this workflow; the linked section uses some predefined islands, for instance. I am no expert in this area, but you could assess whether the islands or certain regions are more methylated than expected.
llrs