
I am writing a paper for a journal and have been asked to calculate the statistical power of my study. I have zero idea how to do that: I am an engineer and have never needed this kind of analysis, but others do. I was told to use the G*Power software for this; however, I don't know where to begin.

I have a study with $68$ records (files) that I have used for a test. In each of them I have $3$ different types of events to detect. In total I have $521$ occurrences of event $A$ and $6318$ of event $B$. The number of $A$ events per record is about $5 \pm 1$, and the number of $B$ events is about $80 \pm 50$.

So now, how do I compute the statistical power? Where do I begin?

GGChe
  • 8
    To calculate statistical power you need to state a null hypothesis that you want to test and state a method that you will use to test it. – Peter Flom Mar 03 '24 at 21:24
  • 4
    Just to clarify, do you mean that you already have your test results and that the journal wants you to do a power analysis post-testing? – Graham Bornholt Mar 03 '24 at 21:49
  • 3
    Correct, they are asking me for the post analysis. – GGChe Mar 03 '24 at 22:15
  • 8
    If they want a calculation of post hoc power, you'd be doing your area of study a favor by explaining that it should not be used. Many posts and comments on site discuss the problems, e.g. see the Gelman reference in mkt's answer and the Hoenig & Heisey reference in dipetkov's answer here, and the discussion here. Searches should yield more – Glen_b Mar 03 '24 at 22:27
  • Further, see Lakens' book here, https://lakens.github.io/statistical_inferences/08-samplesizejustification.html#sec-posthocpower ... with several other references – Glen_b Mar 03 '24 at 22:30
  • I have to agree with the others here that you should not conduct a post-hoc power analysis. I provide some references in my answer which you can point the reviewers to if you get pushback. – Shawn Hemelstrand Mar 04 '24 at 00:18
  • 1
    The Wikipedia entry "Power of a Test" has a useful summary and references for why post-hoc power analyses are a bad idea. – Graham Bornholt Mar 04 '24 at 03:08
  • Wow, neat and clear. The GitHub post is really clear. As I understand it, the a posteriori analysis is just to confirm your initial hypothesis or experimental design? I see that it somehow assumes that you already know the sample effect? – GGChe Mar 04 '24 at 18:44
  • You choose some particular value of the parameter upon which to base the power calculation which you then use to help you select the sample size. It's only a design guide. The actual numbers in the power calculation are only meaningful if, by some fluke, your choice for the inputted parameter value was close to the true value of the parameter. – Graham Bornholt Mar 04 '24 at 21:49
  • What does the $\pm$ indicate? Standard errors? For having 68 records, that error is very large, the deviation is almost equal to the mean. Possibly using some different statistics, like median, or modelling the distribution (Pareto, or other power laws?) might be helpful for the analysis. – Sextus Empiricus Mar 05 '24 at 15:54
  • Why do you end up with an average of $5$ events A and $80$ events B, while $6318/68 \approx 93$ and $521/68 \approx 7.7$? – Sextus Empiricus Mar 05 '24 at 16:00
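
To make concrete what the a priori calculation described in the comments looks like (choosing a parameter value and solving for the sample size or the power), here is a minimal sketch in Python using statsmodels, which offers the same kind of calculations as G*Power. The test family (a two-sample t-test) and the numbers in it (effect size $d = 0.5$, $\alpha = 0.05$, target power $0.8$) are hypothetical placeholders, not values taken from the study above.

```python
# Illustrative a priori power analysis with statsmodels (hypothetical numbers).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Design stage: assume an effect size worth detecting (Cohen's d = 0.5, purely
# as an example) and solve for the per-group sample size giving 80% power at
# alpha = 0.05 for a two-sided, two-sample t-test.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                   alternative='two-sided')
print(f"Required n per group: {n_per_group:.1f}")   # ~63.8

# The same machinery reports the power of a fixed design, e.g. detecting
# d = 0.5 with 68 observations per group (68 is borrowed from the question
# only as a placeholder; the actual test here may be entirely different).
power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=68,
                             alternative='two-sided')
print(f"Power with n = 68 per group: {power:.2f}")  # ~0.83
```

As the comments stress, this output is only meaningful as a design guide: it tells you how large a study would need to be to detect the assumed effect, not how powerful the finished study "really" was.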

1 Answer

12


As others have already noted in the comments, you should never do a post-hoc power analysis. All three of the references below can easily be cited in your defense if the reviewer doesn't accept this.

I will note, however, that a priori power analysis is indeed useful if you have not yet collected your data, so there is a meaningful difference between the two.

References

  • Dziak, J. J., Dierker, L. C., & Abar, B. (2020). The interpretation of statistical power after the data have been gathered. Current Psychology, 39(3), 870–877. https://doi.org/10.1007/s12144-018-0018-1
  • Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1), 33267.
  • Zhang, Y., Hedo, R., Rivera, A., Rull, R., Richardson, S., & Tu, X. M. (2019). Post hoc power analysis: Is it an informative and meaningful analysis? General Psychiatry, 32(4), e100069. https://doi.org/10.1136/gpsych-2019-100069
  • 3
    +1. I would very much add Hoenig & Heisey, "The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis" (The American Statistician, 2001), simply because it is in a "general" statistics journal, whereas the above references are all from psychology/psychiatry and may thus not be completely convincing to engineer reviewers. – Stephan Kolassa Mar 04 '24 at 06:57
  • 1
    Thanks, I'll add that to my list. Good to have something broader anyway. – Shawn Hemelstrand Mar 04 '24 at 07:40
  • 2
    Thanks everyone for the help. This is very nice information, and more literature for my backlog. Always good to have, and not very old. Thanks! – GGChe Mar 04 '24 at 18:45
  • A calculation of the power of a study is useful information about the precision of the study, similar to a confidence interval. This can still be computed a posteriori. Two one-sided t-tests (TOST) are a bit similar to this: you do not only report the p-value for the hypothesis of equivalence, but also the p-value for the hypothesis of difference. – Sextus Empiricus Mar 05 '24 at 12:33
  • Could you elaborate on that point? – Shawn Hemelstrand Mar 05 '24 at 12:36
  • Say that some experiment shows no significant difference (I imagine that this is the case here with the $80 \pm 50$ if that plus minus refers to a standard error); wouldn't it be interesting to also know the power of that test? While performing a power analysis to determine the ideal sample size is a bit too late, for communicating results the power of a test might still be useful to know. If two tests are insignificant with the same p-value, is the power irrelevant? https://stats.stackexchange.com/a/595079/164061 – Sextus Empiricus Mar 05 '24 at 15:43
  • Instead of power, a confidence interval, posterior distribution, or fiducial distribution might also work. The idea is that the p-value/significance shows only one side of the coin and the power shows the other side (two sides that are captured at once by using intervals or distributions instead of single values). – Sextus Empiricus Mar 05 '24 at 15:50
  • 1
    @SextusEmpiricus When you say a confidence interval 'might' also work, what is the question that you are seeking to answer with a post-test power analysis? Before we can address how to do the power calculation, it is still unclear to me why we should. What extra do these power analyses provide? (In contrast, the reasons for testing and for confidence intervals are clear.) – Graham Bornholt Mar 05 '24 at 18:05
  • @SextusEmpiricus if the a priori test is based on randomness (the estimated probability of correctly rejecting the null) while the post hoc test is fixed and known (it is directly observed), then what purpose does this actually serve? It's a bit like saying you can guess the answers to an exam, all while holding the answer sheet in front of you. The results become somewhat meaningless if you completely remove the randomness element from the equation. I think the heart of your comment is in some sense explored in the Zhang article, where they test whether power can be used to estimate significance. – Shawn Hemelstrand Mar 05 '24 at 23:30
  • Power analysis has two goals. (1) It helps with the design of an experiment and ensures that it is sufficiently accurate for answering the research question that one is investigating (doing this a posteriori indeed makes no sense). (2) The power analysis is also a descriptive statistic about the experiment. It is a way to express the size of the (expected) standard error in relationship to the effects that are of interest.... – Sextus Empiricus Mar 06 '24 at 10:05
  • ... I personally prefer confidence intervals, but some people might like to know about the power instead because it is what they typically work with. For them this information is useful to communicate, and when it is left out it is not weird if a reviewer asks about it. Is some study statistically powerful or not? That's not a weird question to ask as a reader of a scientific article. – Sextus Empiricus Mar 06 '24 at 10:15
  • Of course, computing the power at the observed effect size makes little sense, and for a normally distributed statistic with fixed standard deviation there is a direct relationship between the p-value and the power at the observed effect size (along with the significance level). https://stats.stackexchange.com/a/573696/164061 – Sextus Empiricus Mar 06 '24 at 10:29
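
As a small illustration of that last point, here is a sketch for the simplest case of a two-sided z-test (a simplification chosen purely for illustration; the $\alpha$ and p-values below are arbitrary). Plugging the observed effect size back into the power formula makes the "post hoc power" a deterministic function of the p-value and the significance level, which is one of the standard arguments against reporting it.

```python
# "Post hoc power" at the observed effect for a two-sided z-test is a
# function of the p-value and alpha alone (illustrative numbers only).
from scipy.stats import norm

def posthoc_power_from_p(p, alpha=0.05):
    """Two-sided z-test power, evaluated at the effect size implied by p."""
    z_obs = norm.ppf(1 - p / 2)        # |observed z| implied by the p-value
    z_crit = norm.ppf(1 - alpha / 2)   # two-sided critical value
    return norm.cdf(z_obs - z_crit) + norm.cdf(-z_obs - z_crit)

for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> 'post hoc power' = {posthoc_power_from_p(p):.2f}")
# A result with p = alpha always comes out at ~50% power, and any
# non-significant result looks "underpowered" by construction.
```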