With all the concern about reproducibility, I have not seen a very basic question answered: under the standard hypothesis-testing approach, if one experiment results in p<0.05, what is the chance that a repeat experiment will also result in p<0.05? Goodman (1) and others have approached a related problem, starting from a particular p-value for the first experiment, but I have not seen the more general version of the problem as I state it here.
So my question is whether the approach below has already been published somewhere.
Let's make the fairly standard choices of alpha = 0.05 and power = 0.80. We also need to define the scientific context of the experimentation. Let's say we are in a situation where the null hypothesis is actually true for half of the hypotheses tested and the alternative is true for the other half. In other words, the prior probability that the null hypothesis is true is 0.50, which we'll call pNull.
Let's compute the results of 1000 (arbitrary, of course) first experiments.
- Number of experiments where the null H is actually true = 1000 * pNull = 500.
- Number of these expected to result in p<alpha = 500 * alpha = 25 experiments.
- Number of experiments where the alternative H is actually true = (1 - pNull)*1000 = 500
- Number of these expected to result in p<alpha = 500 * power = 400
- Total experiments expected to result in p<alpha = 25 + 400 = 425
Now on to the second experiment. We only run the second experiment for cases where the first experiment resulted in p<alpha.
- Of the 25 experiments (where null is actually true), how many of the second experiments are expected to result in p<alpha? 25 * alpha = 1.25
- Of the 400 experiments (where the alternative is true), how many of the repeat experiments are expected to result in p<alpha? 400 * power = 320
- Number of second experiments expected to result in p<alpha = 1.25 + 320 = 321.25
Given that the first experiment resulted in p<alpha, the chance that a second identical experiment will also result in p<alpha is 321.25/425 ≈ 0.756.
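As a check on the arithmetic, here is a minimal Python sketch of the same counting argument, written with proportions rather than 1000 experiments. The function name `replication_probability` and its argument names are my own, chosen for illustration. It assumes, as the counts above do, that the repeat experiment has the same alpha and the same power as the first.

```python
def replication_probability(alpha=0.05, power=0.80, p_null=0.50):
    """Chance that a repeat experiment gives p < alpha, given the first one did.

    Assumes the repeat has the same alpha and power as the first experiment.
    """
    # Fraction of all first experiments that reach p < alpha,
    # split by which hypothesis is actually true.
    first_null = p_null * alpha        # false positives
    first_alt = (1 - p_null) * power   # true positives

    # Fraction that reach p < alpha in both the first and the repeat experiment.
    both_null = first_null * alpha
    both_alt = first_alt * power

    return (both_null + both_alt) / (first_null + first_alt)


print(round(replication_probability(), 3))  # 0.756
```

Changing `p_null`, `alpha`, or `power` shows how strongly the answer depends on the assumed scientific context.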
This assumes you set alpha = 0.05 and power = 0.80, and that the scientific situation is such that pNull = 0.50. I like to think things out verbally, but of course this can all be compressed into equations. My question is whether this straightforward approach has already been published.
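For what it's worth, one way to write the counting argument as a single equation is sketched below. The notation is my own: $p_1$ and $p_2$ are the p-values of the first and repeat experiments, and the repeat is assumed to have the same alpha and power as the first.

$$
P(p_2 < \alpha \mid p_1 < \alpha)
= \frac{\text{pNull}\cdot\alpha^2 + (1-\text{pNull})\cdot\text{power}^2}
       {\text{pNull}\cdot\alpha + (1-\text{pNull})\cdot\text{power}}
= \frac{0.5 \cdot 0.05^2 + 0.5 \cdot 0.80^2}{0.5 \cdot 0.05 + 0.5 \cdot 0.80}
= \frac{0.32125}{0.425} \approx 0.756
$$

The numerator and denominator are the 321.25 and 425 from the counts above, each divided by 1000.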
(1) Goodman, S. N., 1992, A comment on replication, P-values and evidence: Statistics in Medicine, v. 11, no. 7, p. 875–879, doi:10.1002/sim.4780110705.