5

I have submitted a paper to a journal reporting a non-significant finding. One of the reviewers has asked me to include a power analysis on my data to work out how big a sample size I would need to adequately power a study to test a raw mean difference of 1500 for significance. Basically he wants me to "prove" my study was not underpowered. Could someone offer me advice on how to do this?

I ran an unpaired t-test where:

  • Group A: (n = 5) mean = 5424; SD = 1930
  • Group B: (n = 4) mean = 4832; SD = 2402
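
For reference, a minimal sketch in Python (scipy is my choice of tool here, and the pooled-variance form is assumed, as discussed in the comments) that reproduces this test from the summary statistics:

```python
# Minimal sketch: reproduce the unpaired (pooled-variance) t-test
# from the summary statistics above.
from scipy.stats import ttest_ind_from_stats

t_stat, p_value = ttest_ind_from_stats(
    mean1=5424, std1=1930, nobs1=5,   # Group A
    mean2=4832, std2=2402, nobs2=4,   # Group B
    equal_var=True)                   # pooled SD

print(t_stat, p_value)  # t ≈ 0.41, p ≈ 0.69 (non-significant)
```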
DrNeuro
    Are the 2nd numbers SDs or SEs? What would you want to be able to detect? If the idea is the effect that you actually see in your data, you are underpowered by definition (that's why it's non-significant). It's possible there is some other effect that you were sufficiently powered to detect, but just didn't find. However, to determine that, the 'other effect' needs to be specified. – gung - Reinstate Monica Feb 02 '16 at 22:22
  • Do you want to say that the SDs are equal (using the pooled SD), or do you want the N required to power the Welch t-test? – gung - Reinstate Monica Feb 02 '16 at 22:33
  • I would like to use the pooled SD. Could you talk me through the working so I can apply it to my other experiments? I appreciate your time – DrNeuro Feb 02 '16 at 22:37
  • Sure, it'll be a bit before I can get to this, though. – gung - Reinstate Monica Feb 02 '16 at 22:42
  • No problem. I'll pop back online tomorrow. Thank you – DrNeuro Feb 02 '16 at 22:43
  • Just say no. With a nonsignificant finding, a post hoc power based on the observed effect size will ALWAYS yield a low power. It is circular logic and an empty exercise. See e.g. Hoenig and Heisey, "The Abuse of Power", 2001, The American Statistician. – Russ Lenth Feb 03 '16 at 00:29
  • @rvl, a mean difference of 1500 isn't the observed effect size, though. Note that the observed mean difference is 592. I think this isn't as mindless as the typical situation you are referring to. – gung - Reinstate Monica Feb 03 '16 at 03:59
  • Ok @gung, good point. I hadn't read the question completely. – Russ Lenth Feb 03 '16 at 04:15

2 Answers

7

Power analyses exploit an equation with four variables ($\alpha$, power, $N$, and the effect size). When you solve for power by stipulating the others, it is called "post hoc" power analysis. People often use post hoc power analysis to determine the power they had to detect the effect observed in their study after finding a non-significant result, and use the low power to justify why their result was non-significant and their theory might still be right. As @rvl points out in the comments, this involves "circular logic and [is] an empty exercise". However, that is not what you are doing here. Moreover, 'post hoc' power analysis can be a legitimate exercise: for example, I have had cases where a researcher knew they would only be able to get a certain number of patients with a rare disease and wanted to know the power they would be able to achieve to detect a given clinically significant effect. Although that isn't 'post hoc' in the sense of after the fact, it is called "post hoc" power analysis because it solves for power as a function of the other three.

I will go out on a limb and assume your $\alpha$ was $.05$. Clearly, $N = 9$. We can determine the effect size by calculating the pooled SD, and then the standardized mean difference that corresponds to a raw mean difference of $1500$ and the computed pooled SD.

\begin{align}
SD_\text{pooled} &= \sqrt{\frac{(n_1-1)s^2_1 + (n_2-1)s^2_2}{(n_1+n_2)-2}} \\
2145.041 &= \sqrt{\frac{(5-1)1930^2 + (4-1)2402^2}{(5+4)-2}} \\[10pt]
ES &= \frac{\text{mean difference}}{SD_\text{pooled}} \\
0.70 &= \frac{1500}{2145.041}
\end{align}

Having determined the effect size you want to use, you need some software to do the power analysis calculation for you. (It involves numerical approximations that you cannot do by hand.) A free and convenient application is G*Power.

[G*Power screenshot: post hoc power calculation for the two-tailed independent-samples t-test described above]

The power to detect a standardized mean difference of $0.70$ with $\alpha = .05$ using a two-tailed $t$-test when $N = 9$ is $\approx 15\%$. I would say your study is underpowered.
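
If you prefer to script the calculation rather than use a GUI, the same numbers can be reproduced in Python with statsmodels (my choice of tool here; G*Power, R's pwr package, etc. will agree):

```python
# Minimal sketch: post hoc power for the two-tailed independent-samples
# t-test, reproducing the G*Power result above.
from statsmodels.stats.power import TTestIndPower

pooled_sd = ((4 * 1930**2 + 3 * 2402**2) / 7) ** 0.5  # 2145.041
effect_size = 1500 / pooled_sd                        # d ≈ 0.70

power = TTestIndPower().solve_power(
    effect_size=effect_size,
    nobs1=5,            # group 1 sample size
    ratio=4 / 5,        # nobs2 = nobs1 * ratio = 4
    alpha=0.05,
    alternative='two-sided')

print(round(power, 2))  # ≈ 0.15
```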

  • (+1) Not sure there's anything specifically post-hoc about solving for power by stipulating the other variables. What's post-hoc in this case is using the variance estimate from the sample in the determination of effect size. – Scortchi - Reinstate Monica Feb 03 '16 at 11:44
  • Thanks for clarifying. (My understanding of the terminology is a little different: "power analysis" is the general term for exploitation of the equation, regardless of whether you're calculating the power for a given sample size &c., or the sample size for a given power &c.; while "post hoc power analysis" uses information not available prior to analysis of the experimental data - the variance estimate as per your example, or (as you quite rightly frown upon) the observed effect size.) – Scortchi - Reinstate Monica Feb 03 '16 at 14:17
  • I don't know @Scortchi. Unfortunately, terminology gets used in very different ways. I've never heard that scheme before, but I don't doubt it's used. FWIW, G*Power uses the same scheme / terms I do, where the names correspond to what you're solving for. – gung - Reinstate Monica Feb 03 '16 at 14:46
  • After your edit I didn't suppose you didn't mean it, but I just wanted to leave an explanation of the other usage here for reference. Wikipedia goes further than me by saying that post-hoc analyses are just those that use observed effect size. – Scortchi - Reinstate Monica Feb 03 '16 at 14:53
  • I apologize if my comment came off poorly, @Scortchi. I certainly didn't doubt your sincerity. The Wikipedia page is interesting, it conflates prior with N, & subsequent with power. For me those are distinct. For example, I do lots of "a-priori" power analyses for people, and almost always they bring in a vaguely analogous paper with some observed effect and ask me to use that. I always talk about the effect they would care about detecting, but people can't think in those terms; they want to use a previously observed effect. – gung - Reinstate Monica Feb 03 '16 at 15:22
  • So by Wikipedia's definition, should that be "post hoc" (because it uses an observed effect) or "a-priori" (because it is solving for N)? I don't know. – gung - Reinstate Monica Feb 03 '16 at 15:24
  • Sorry, it didn't come off poorly at all! I was just trying to say that though I initially thought it might be a slip, I knew that because you kept it you must have had a good reason, as indeed you do. Anyway, though I don't know what Wikipedia would say about that situation, I'd say that being post hoc or a priori isn't an intrinsic property of a power analysis, but relative to a given experiment: if it uses the results from experiment $X$ it's post hoc with respect to $X$ but can still be a priori with respect to another experiment $Y$. – Scortchi - Reinstate Monica Feb 03 '16 at 16:11
4

I can't believe people are still asking for post-hoc power analyses!

Please, do not include this in your paper. The post-hoc power analysis is not going to tell you anything, and people reading your paper will think that you do not know what you are doing!

Power analyses can only be meaningfully performed before you collect your data. They are very useful for, e.g., determining the number of samples you need to collect in order to detect a particular effect size (see the sketch below). After the study, a "post hoc" analysis is useless, since both your effect and sample sizes are constants. Some people argue that it can be used to determine the required sample size of a hypothetical future study, but the utility of this is debated (since it only makes sense if that study is actually performed). In any case, the reviewer did not ask for this; he/she asked you to prove that your study was not underpowered, which is something you cannot do with a post hoc power analysis.
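
For completeness, here is what the legitimate, a priori version of that calculation looks like: solving for the per-group sample size needed to detect the standardized difference of 0.70 from the other answer with 80% power. This is a minimal Python sketch using statsmodels (my choice of tool; any power calculator gives the same answer):

```python
# Minimal sketch: a priori sample-size calculation -- solve for the
# per-group N that gives 80% power to detect d = 0.70 at alpha = .05.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.70,
    power=0.80,
    alpha=0.05,
    ratio=1.0,               # equal group sizes
    alternative='two-sided')

print(round(n_per_group))  # ≈ 33 per group
```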

A quick web search turned up the following posts/papers arguing against post hoc power. Please read them and refer your reviewer to them; clearly, he/she does not understand what he/she is asking you to do.

References:

  • Hoenig, J. M., & Heisey, D. M. (2001). The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis. The American Statistician, 55(1), 19–24.
  • O'Keefe, D. J. (2007). Post Hoc Power, Observed Power, A Priori Power, Retrospective Power, Prospective Power, Achieved Power: Sorting Out Appropriate Uses of Statistical Power Analyses. Communication Methods and Measures, 1(4), 291–299.

Sorry for yelling ;-P

Tommy L
  • I totally agree, Tommy! However, it is a difficult situation when the reviewer won't back down and it's the only thing standing between you and a publication. Very frustrating! – DrNeuro Feb 03 '16 at 09:16
  • Yes, I understand. Perhaps the editor is reasonable? Or you could make some vague statement like the one I mentioned: that in a hypothetical future study, should a particular sample size and effect size be sought, the power of that study would be such-and-such. – Tommy L Feb 03 '16 at 10:00
  • Yes, I'll do that. I'll make a small statement about a hypothetical future study, but I'll also make an argument to the editor about post hoc power analysis being futile. Thank you – DrNeuro Feb 03 '16 at 10:07
  • Sounds perfect! ;-) Good luck with your paper! – Tommy L Feb 03 '16 at 10:10
  • Good points - but note the mean difference of 1500 for which power is to be calculated isn't what was observed, so this isn't post-hoc power analysis as usually understood. (Confidence intervals around the difference estimate might be a more natural way of addressing the same concern.) Indeed, O'Keefe writes "[...] where after-the-fact power analyses are based on population effect sizes of independent interest (as opposed to a population effect size exactly equal to whatever happened to be found in the sample at hand), they can potentially be useful." & goes on to explain why. – Scortchi - Reinstate Monica Feb 03 '16 at 11:20