I am using a rank-sum test to compare the medians of two samples ($n = 120000$) and have found that they are significantly different, with $p = 1.12 \times 10^{-207}$. Should I be suspicious of such a small $p$-value, or should I attribute it to the high statistical power associated with having a very large sample? Is there any such thing as a suspiciously low $p$-value?
This is almost a duplicate of https://stats.stackexchange.com/questions/78839. – amoeba May 09 '18 at 19:53
3 Answers
$P$-values on standard computers (using IEEE double-precision floats) can get as low as approximately $10^{-308}$. These can be legitimately correct calculations when effect sizes are large and/or standard errors are low. Your value, if computed with a $t$ or normal distribution, corresponds to an effect size of about 31 standard errors. Remembering that standard errors usually scale with the reciprocal square root of $n$, that reflects a difference of less than $0.09$ standard deviations ($31/\sqrt{120000} \approx 0.089$, assuming all observations are independent). In most applications, there would be nothing suspicious or unusual about such a difference.
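To make the arithmetic concrete, here is a minimal sketch (assuming a two-sided test under the normal approximation; `p` and `n` are taken from the question):

```python
# Back out the implied effect size from the reported p-value,
# assuming a two-sided test and a normal approximation.
from math import sqrt
from scipy.stats import norm

p = 1.12e-207
n = 120_000

z = norm.isf(p / 2)   # inverse survival function: z with upper-tail area p/2
print(z)              # about 30.8 standard errors
print(z / sqrt(n))    # about 0.089 standard deviations, since SE ~ 1/sqrt(n)
```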
Interpreting such p-values is another matter. Viewing a number as small as $10^{-207}$, or even $10^{-10}$, as a probability exceeds the bounds of reason, given all the ways in which reality is likely to deviate from the probability model underpinning the calculation. A good choice is to report the p-value as being less than the smallest threshold you feel the model can reasonably support: often somewhere between $0.01$ and $0.0001$.
When I reported "$p<10^{-26}$" in a conference paper, a reviewer told me that I should change it to "$p<0.001$" in order to follow APA guidelines. – Avery Richardson Jun 10 '11 at 23:15
(+1) At some point it's more likely that the government is nefariously flipping bits in your RAM remotely with super spy technology... – JMS Jun 11 '11 at 04:56
(+1) You can actually get down to just below $5 \times 10^{-324}$ in IEEE double-precision floating point. But your numerical routines for calculating $p$-values are almost guaranteed to fall apart before then. Unless you know for a fact that your modeling assumptions are perfectly correct (and when are they?), a $p$-value eventually just becomes a measure of the sample size once the sample gets large enough. – cardinal Jun 11 '11 at 15:18
@Cardinal we're both wrong about the limits: apart from denormalized values, the smallest IEEE double is approximately $10^{-308}$, corresponding to an 11-bit base-2 exponent field. – whuber Jun 11 '11 at 16:41
@whuber: I was listing the smallest positive representable number, including subnormal values: $2^{-1074}$. :) – cardinal Jun 11 '11 at 16:58
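(Both limits in this exchange are easy to verify directly; here is a quick check in Python, whose floats are IEEE 754 binary64.)

```python
import sys

print(sys.float_info.min)  # ~2.2250738585072014e-308: smallest positive normal double
print(5e-324)              # smallest positive subnormal double, equal to 2**-1074
print(5e-324 / 2)          # rounds to 0.0: nothing smaller is representable
```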
@ThomasLevine: isn't that just losing information? Not much information, to be sure, but it's also not saving any space, or making it easier to read. Sounds like a pointless convention... – naught101 Apr 20 '12 at 01:46
I agree with @naught: especially for those trying to reproduce or check a paper, having a reasonably precise value for comparison is helpful. Changing "$\lt 10^{-26}$" to "$\lt 0.001$" erases 23 significant digits! Evidently, the APA guidelines focus on reporting results rather than on checking them -- to the detriment of all science. – whuber Aug 03 '21 at 15:06
There are many assumptions required for the correctness of extremely low p-values, not the least of which are the correctness of the chosen data model and how well the actual experimental design fits the p-value calculation. – Frank Harrell Sep 24 '23 at 12:27
@Frank I suspect you might be confounding correctness with interpretation. All p-values are computed from hypothetical models, usually expected to be counterfactual. – whuber Sep 24 '23 at 15:00
You're right, but the fact that the models are not exactly right means that for the purpose at hand it is pretty silly to compute p-values to such small values. – Frank Harrell Sep 24 '23 at 15:31
@Frank Usually, yes; but not always. The precision can be useful for readers who wish to make Bonferroni or other corrections, as well as readers attempting to reproduce the results. Re "pretty silly:" I have seen more than one expert report produced in litigation that not only gleefully quoted astronomically small p-values (a striking one was around $10^{-80},$ quoted for a Fisher Exact Test by an Ivy League sociologist), they made a point of writing out all the zeros in decimal notation. It impressed the judge. :-( – whuber Sep 24 '23 at 16:32
There is nothing suspicious: extremely low p-values like yours are common when sample sizes are large (and $n = 120000$ is very large for comparing medians). As whuber mentioned, such p-values are normally reported as being less than some threshold (e.g., $p < 0.001$).
One thing to be careful about is that a p-value only tells you whether the difference in medians is statistically significant. Whether the difference is large enough in magnitude to matter is something you will have to decide: with very large samples, extremely small differences in means or medians can be statistically significant without meaning very much in practice, as the simulation below illustrates.
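Here is a simulation sketch (the shift of $0.05$ standard deviations and the use of SciPy's `ranksums` are illustrative choices, not the original poster's setup):

```python
# Two normal samples of n = 120,000 whose locations differ by a trivial
# 0.05 sd: the rank-sum test still returns an astronomically small p-value.
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(42)
n = 120_000
x = rng.normal(0.00, 1.0, size=n)
y = rng.normal(0.05, 1.0, size=n)  # a shift most applications would ignore

stat, p = ranksums(x, y)
print(p)                            # on the order of 1e-30: "significant" by any threshold
print(np.median(y) - np.median(x))  # yet the difference in medians is only ~0.05
```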
A p-value can be exactly $0$.
Suppose I am testing a hypothesis about the upper bound $\theta$ of a $\mathrm{Uniform}(0, \theta)$ random variable. If I set $\mathcal{H}_0: \theta = 1$ and observe a value of $X = 1.1$, it is impossible to see such a value or higher under the null hypothesis, so the p-value is exactly $0$.
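A quick numerical check (using SciPy's standard uniform, which is the null distribution when $\theta = 1$):

```python
from scipy.stats import uniform

# P(X >= 1.1) under Uniform(0, 1): the observation lies outside the support
# of the null distribution, so the upper-tail p-value is exactly 0.
print(uniform.sf(1.1))  # 0.0
```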