8

I have some data I am working with, and I would like to know whether I can combine p-values from paired t-tests on individual CpG sites, using Fisher's Method, to get one p-value for each unique gene. Linked here is the Wikipedia page for Fisher's Method. I understand that the method assumes the p-values being combined are independent. I'm relatively new to biostatistics, so I'm curious whether using CpGs from the same gene would violate this assumption.

user1309
  • Why do you want to combine the CpG values? Do you want to know whether a specific gene/region is more methylated than others? What is your hypothesis (you mention a t-test for the CpG sites)? Are you testing whether they are methylated vs. not, or more methylated than the rest of the genome, or...? – llrs Aug 11 '17 at 10:54
  • 1
    In a nutshell, your intuition is correct: they are not independent. – Konrad Rudolph Sep 12 '17 at 10:05

2 Answers

6

Methylation levels have high local correlation, so p-values from CpGs in the same gene are not independent and Fisher's method would be problematic. Having said that, you have no reason to use Fisher's method after a paired t-test. A paired t-test will give you a single p-value per gene, which is what you want. Do be sure to only include CpGs with some minimal coverage in both groups.
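For reference, here is a minimal sketch of what Fisher's method actually computes, which may make the independence issue more concrete. It combines k p-values into X = -2 * sum(ln p_i) and compares X to a chi-square distribution with 2k degrees of freedom. The function name `fisher_combine` and the example p-values are made up for illustration and are not from your data. The chi-square reference distribution only holds if the p-values are independent; with positively correlated CpGs the combined p-value will tend to be too small.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method (assumes they are independent).

    Returns (statistic, combined_p). Hypothetical helper for illustration.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic: X = -2 * sum(ln p_i).
    # Under the null AND independence, X follows chi-square with 2k df.
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom (2k) the chi-square survival function
    # has a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Made-up per-CpG p-values for a single gene (illustrative only).
cpg_p_values = [0.04, 0.20, 0.03]
stat, combined_p = fisher_combine(cpg_p_values)
print(f"X = {stat:.3f}, combined p = {combined_p:.4f}")
```

If SciPy is available, `scipy.stats.combine_pvalues(p_values, method="fisher")` should give the same result, with the same independence caveat.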

Devon Ryan
1

What is typically done in methylation analysis is to assess methylation at the level of islands.

Check this workflow; the linked section uses some predefined islands, for instance. I am no expert in this area, but you could assess whether the islands or certain regions are more methylated than expected.

llrs