
I am performing a biological experiment in which I am trying to capture potential mediation effects of some genes.

I have 150 significant results, spanning 30 tissues with different sample sizes.

For each of those results, I re-ran the regression analysis after adding some genes, to observe whether the p-values go up or down.

Now that I have the two sets of p-values, the original ones and the conditional ones, is there a test I can perform to formally assess whether the difference between these two sets is significant?

Thanks!

dsbo
  • Welcome to Cross Validated! To get a useful answer, you will need to provide more details in your question about the underlying scientific question, the experimental design, what you measured, and the analyses you've performed so far. From what's in this first version of the question, it seems that the approach might suffer from some problems like those of automated model selection, but there aren't enough details in your question to know for sure. Please provide that information by editing the question, as comments are often overlooked. – EdM Jul 13 '23 at 20:21
  • Hi dsbo, maybe you can briefly describe why you want to test that the two sets of p-values are different? I suggested a method for testing whether the p-values are the same before and after taking the extra genes into account, without asking whether this makes sense at all, but there seems to be a lot of concern about that. I also have a relatively minor concern: in the first, original set of p-values, you will never get p-values above your chosen threshold, but this is possible in the second set, so there will always be a difference between the two sets. – Ute Jul 13 '23 at 21:19
  • Since your p-values seem to come from a linear model with the genes in question as independent variables, I have to agree with the experts and say that it is very questionable to try to test differences between the two sets of p-values. A test result could then just be seen as a kind of measure of distance between two sets, but since there is no absolute reference, it would not make sense. I will meanwhile delete my current attempt to answer. As you are interested in the changes, you could try to summarize them in a different way, for example graphically. – Ute Jul 14 '23 at 09:29
  • P.S. I don't deem your question meaningless! It is interesting to look at p-value changes when adding variables to a model, no doubt. // Here is a generic discussion of testing two sets of p-values: https://stats.stackexchange.com/q/505215 – Ute Jul 14 '23 at 10:01
  • @Ute That discussion uses the p-values only as the source of a dataset of paired values and concerns testing whether the components differ in some sense. Although perhaps related, it's not the same question and isn't addressed in the same way. I still have to ask, since you deleted your post and the comments with it: why not just recommend an omnibus F test? That's the standard procedure here. – whuber Jul 14 '23 at 16:57
  • @whuber, it seems you understand the question in the same way as I do: they want to see what happens to the 150 p-values for the coefficients associated with the 150 genes. If we use an F test for the two models, with and without, we get information about the fit of these models, not about the 150 coefficients of interest. One of the reasons I discarded my answer is that a simple t test would do, at least if the raw p-values are compared, and yes, that is also an F test. Another reason: it would be cleverer to state a proper null and test that. I am considering undeleting it after a rewrite. (A sketch of these approaches follows these comments.) – Ute Jul 15 '23 at 05:15
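To make the alternatives discussed in these comments concrete, here is a minimal sketch under made-up assumptions: the column names (`expression`, `exposure`, `mediator`) and all data are hypothetical placeholders, not the actual study. Part (a) is the nested-model omnibus F test mentioned by @whuber, part (b) is the naive paired comparison of the two p-value vectors described by @Ute, and part (c) is a simple graphical summary of the changes.

```python
# Sketch only: column names and data below are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"exposure": rng.normal(size=n)})
df["mediator"] = 0.5 * df["exposure"] + rng.normal(size=n)
df["expression"] = 0.3 * df["exposure"] + 0.4 * df["mediator"] + rng.normal(size=n)

# (a) Omnibus F test comparing nested models with and without the added gene(s):
#     it tests whether the extra terms improve the fit, not the 150 coefficients themselves.
reduced = smf.ols("expression ~ exposure", data=df).fit()
full = smf.ols("expression ~ exposure + mediator", data=df).fit()
print(anova_lm(reduced, full))

# (b) Naive paired comparison of the original and conditional p-value vectors.
#     These vectors are simulated stand-ins for the 150 real p-values.
p_original = rng.uniform(1e-6, 0.05, size=150)     # all significant by construction
p_conditional = np.clip(p_original * rng.lognormal(0.5, 0.8, size=150), 1e-12, 1.0)
print(stats.ttest_rel(p_original, p_conditional))   # paired t test
print(stats.wilcoxon(p_original, p_conditional))    # nonparametric alternative

# (c) Graphical summary of the changes on the -log10 scale.
plt.scatter(-np.log10(p_original), -np.log10(p_conditional))
plt.axline((0, 0), slope=1, linestyle="--")         # no-change reference line
plt.xlabel("-log10 p (original)")
plt.ylabel("-log10 p (conditional)")
plt.show()
```

Note that, as pointed out above, the original p-values are all below the significance threshold by construction while the conditional ones need not be, so any such comparison is tilted toward finding a difference.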

1 Answer


The test you're proposing isn't meaningful.

  1. Your original p-values correspond to tests of the effects of the respective genes without adjusting for the effects of the other genes, while your new p-values correspond to tests of those effects after adjusting for the other genes. These are entirely different tests.
  2. Hypothesis tests should always concern a parameter of interest. Which parameter(s) are you interested in? P-values are not parameters. EDIT: Note that a general distribution can be viewed as a collection of infinitely many parameters, so hypothesis tests of distributions are also valid (a minimal sketch of such a test follows).
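As a minimal illustration of such a distributional comparison (a sketch only, not an endorsement that it answers the scientific question), one could apply a two-sample Kolmogorov–Smirnov test to the two vectors of p-values. Everything below is a simulated placeholder for the real data:

```python
# Sketch only: both p-value vectors below are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
p_original = rng.uniform(1e-6, 0.05, size=150)   # all below the threshold by construction
p_conditional = rng.beta(0.8, 4.0, size=150)     # some may now exceed the threshold

# Two-sided Kolmogorov-Smirnov test: H0 is that both samples come from one distribution.
print(stats.ks_2samp(p_original, p_conditional))
```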
  • I disagree with your point 2. Statistical tests are not restricted to parametric models and hypotheses; you can also test whether two samples come from the same distribution, without further specification. – Ute Jul 14 '23 at 09:23
  • @Ute: I agree that we can form hypotheses about distributions. If we think of general distributions as being characterized by an infinite number of parameters, then we are both right. :-) I have edited my post to address your point. – Rachel Altman Jul 14 '23 at 16:27
  • A truly Solomonic solution, Rachel :-) – Ute Jul 14 '23 at 16:29
  • Thanks, @Ute. :-) – Rachel Altman Jul 14 '23 at 16:54