1

I've seen some plots with, e.g., a value for statistical significance of p = 0.7 and an effect size of g = 0.5.

Is it possible to say there is a certain effect (size) when it is not significant?

Ben
  • 3,443
  • Thanks, but I'm not asking about the meaning of the effect size; rather, whether it makes sense to report it despite (or alongside?) the non-significance. I would say it doesn't, or does it? – Ben Sep 22 '22 at 07:25
  • 2
    For clarity, it is often reasonable to refer to the estimated effect whether or not it is statistically significant. For example: the estimated increase in yield of x is economically significant but not statistically significant. Or: the estimated improvement of y is both medically and statistically significant. So "do we need anything more" is, I guess, your question. – Graham Bornholt Sep 22 '22 at 08:28
  • Thanks, that's an interesting input. I always thought the effect size represents how much it matters, but yeah, sometimes even small changes matter. Right? – Ben Sep 22 '22 at 09:48

2 Answers

3

The finding that an effect is non-significant has a very limited meaning. For example, it depends on an arbitrary cut-off point (10%, 5%, 1%; why not 6.2968730124564859%?). More can be found in the ASA Statement on p-Values: https://amstat.tandfonline.com/doi/pdf/10.1080/00031305.2016.1154108 . In short: you can, and should, talk about effect sizes for non-significant effects.
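As an illustration of this point (my own simulated numbers, nothing from the question): two samples drawn from the same population still yield a nonzero estimated effect size together with a large p-value, and both numbers are still reportable. This is a minimal stdlib-only sketch, using a pooled-SD standardized mean difference (Cohen's d) and a normal approximation for the two-sided p-value.

```python
# Illustration: samples from the *same* population give a nonzero
# estimated effect size purely due to sampling, alongside a large p-value.
import math
import random

random.seed(42)
a = [random.gauss(0.0, 1.0) for _ in range(30)]
b = [random.gauss(0.0, 1.0) for _ in range(30)]

def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    s_pooled = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / s_pooled

def two_sample_p(x, y):
    """Two-sided p-value via a normal approximation to the t statistic."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return math.erfc(abs(z) / math.sqrt(2))

print(f"estimated effect size d = {cohens_d(a, b):.2f}")
print(f"approximate p-value     = {two_sample_p(a, b):.2f}")
```

(For real analyses you would use a proper t-test, e.g. `scipy.stats.ttest_ind`; the normal approximation here just keeps the sketch dependency-free.)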

Maarten Buis
  • 21,005
  • 1
    p=0.7 is far larger than any conventional (or reasonable) cutoff, so the data are quite clearly compatible with no effect, regardless of any controversy about arbitrary cutoff values (if not quite totally regardless of model assumptions). – Christian Hennig Sep 22 '22 at 09:53
  • "But in short, you can, and should, talk about an effect sizes for non-significant effects." - Feels like this asks for some further explanation. I can take two finite samples from the same population and estimate the effect size, which would be non zero due to sampling. I can talk about this (likely non-significant) effect size, but what is there to say? – Karolis Koncevičius Sep 22 '22 at 10:14
  • @ChristianHennig "quite clearly compatible with no effect" Absence of evidence is not evidence of absence. All the test says is that we could not find evidence against the hypothesis of no effect. That is something very different from saying there is no effect and that we thus should not talk about it, as the OP suggests. That p-value is a matter of concern, but getting into a position where we have a table of results in which some are edited out with a note "not significant" is a lot worse. – Maarten Buis Sep 22 '22 at 10:57
  • 1
    @MaartenBuis "quite clearly compatible with no effect" does not claim at all that there is evidence of absence; in fact compatibility statements like this are perfectly in line with the ASA statement cited by you. – Christian Hennig Sep 22 '22 at 12:30
1

The estimated effect size is, if you want, the "best guess" that we can have from the data, assuming the model is fine and the estimation procedure valid (which I also assume in the following).

$p=0.7$, however, means that if there were no effect, it would still be quite likely to obtain data showing an estimated effect size as big as the one you observed, or even bigger. This means that the data might well have arisen from a model with zero effect; in other words, there is no evidence that the effect is nonzero. Obviously this doesn't mean that there is no effect, but it does mean that the data could not distinguish your estimated effect from zero, or even from an effect in the opposite direction.

You could compute a confidence interval to see a range of parameters compatible with the data that gives you some indication about the uncertainty of your estimate. (Obviously for this you'd need to decide the confidence level, but conventional ones such as 95% or 99% are reasonable.)

So any communication of the estimated effect size should be accompanied by an indication of how uncertain it actually is. Just saying "there is an effect of 0.5" is misleading, because the true effect may well be very different, or even nonexistent.
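To make the confidence-interval suggestion concrete (with made-up data, not the poster's): a sketch of a 95% interval for a mean difference using a normal approximation. When the p-value is large, the interval typically spans zero, i.e. effects in either direction remain compatible with the data.

```python
# Sketch: 95% CI for a difference of means, normal approximation.
import math

def mean_diff_ci(x, y, z_crit=1.96):  # 1.96 ~ 95% normal quantile
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se = math.sqrt(vx / nx + vy / ny)       # standard error of the difference
    diff = mx - my                           # point estimate
    return diff - z_crit * se, diff + z_crit * se

# Made-up measurements for illustration only.
x = [5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 5.2, 4.7]
y = [5.0, 5.4, 4.6, 5.1, 4.9, 5.5, 4.8, 5.2]
lo, hi = mean_diff_ci(x, y)
print(f"95% CI for the mean difference: ({lo:.2f}, {hi:.2f})")
```

Reporting the whole interval, rather than just the point estimate, is exactly the "indication of uncertainty" argued for above.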

  • "No evidence" sounds a little too strong. For instance, if the p-value were 0.07 instead of 0.7, would you still say "no evidence"? Shouldn't the amount of evidence be considered a continuum rather than a binary "there is or there isn't" quantity? – whuber Sep 22 '22 at 12:07
  • 1
    @whuber I was specifically talking about p=0.7 as was given in the question. With p=0.7 in my view there is no evidence, as events with probability 0.7 happen all the time. With p=0.07 it's more subtle. I am not committed to "there is or there isn't" in general, but with p=0.7 there isn't. – Christian Hennig Sep 22 '22 at 12:29
  • Of course you were using what's in the question. My point is, where is the threshold between "no evidence" and "evidence"?? You might try $p=0.5,$ I suppose. It would be an interesting argument. I can imagine a Bayesian demonstrating that with certain priors, almost any value of $p$ less than $1.0$ would be some amount of evidence. – whuber Sep 22 '22 at 14:57
  • 1
    @whuber Personally I'd say as long as it's >=0.2 "this kind of thing happens all the time" still applies, maybe even 0.1. The thing is, thresholds are ultimately always arbitrary, but language is discrete by nature, so thresholds are always implied if such numbers are to be interpreted in language. There is no way around this (neither for Bayesians by the way). – Christian Hennig Sep 22 '22 at 16:07
  • "No evidence", "no effect", "no difference" should be avoided when discussing test outcomes since such statements are usually contradicted by the accompanying point estimate of the effect. It is better to report the point estimate, report the p-value and report, for example, that the evidence for a nonzero effect is not statistically significant (at the x% level). – Graham Bornholt Feb 25 '24 at 21:54
  • @GrahamBornholt "No evidence" is very different from "no effect" or "no difference". I'd agree about the latter two but not the former one. If you observe a result that is perfectly realistic and normal assuming a certain model, I don't see how that result holds any evidence against that model.That doesn't mean that the model is true (i.e., that there is really no effect). – Christian Hennig Feb 25 '24 at 22:53
  • I could almost agree with the statement that there is no evidence against the null but to say there is no evidence of a nonzero effect seems a step too far when the point estimate is nonzero. Maybe I'm splitting hairs. – Graham Bornholt Feb 26 '24 at 00:39
  • @GrahamBornholt How is "no evidence against the null" different from "no evidence of a nonzero effect"? The null is that the effect is zero. – Christian Hennig Feb 26 '24 at 10:55
  • Ok, maybe this will clarify the issue. If the null is that the population mean is nonpositive, and the sample produces a p-value of 0.4 and a sample mean of 3.1, would you say that there is no evidence of a positive mean? (I'm trying to understand whether your earlier 'no evidence' comment was the result of the point null hypothesis in that case). – Graham Bornholt Feb 27 '24 at 01:55
  • @GrahamBornholt I wouldn't think it's wrong to say that. With p=0.4 the mean could basically be as much south as north of zero. 3.1 as absolute number doesn't say anything, it has to be interpreted in relation to the variation, and if the variation is big enough to give p=0.4 here, we hardly know anything about the sign. (If you now say that there's a tiny little bit better chance that the mean is $>0$ than $\le 0$, I wouldn't object either. But basically p=0.4 means we hardly know anything.) – Christian Hennig Feb 27 '24 at 17:41
  • I agree with what you are saying about the implications of p-values for the hypotheses, and that "we hardly know anything about the sign". Let's leave it at that before we are sent to the chat room. – Graham Bornholt Feb 27 '24 at 18:38