
The recommended way to do a priori power calculations for linear mixed models is to simulate data and then estimate power from the simulations, for example with the simr package.

All these approaches require you to specify detailed parameters of the model you assume. These can sometimes be estimated from prior data. However, what would you recommend if no prior data can be obtained, for example in novel research areas?

I am currently planning a study with 6 observations per participant, and I would like to know how many participants I need. I intend to compare two nested models, one including two more parameters than the other.

I will do a post-hoc power analysis to determine how large an effect I would have been able to find with the number of participants I had. However, I need a reasonable way to justify my number of participants beforehand.

Max J.

1 Answer


Although you are conducting novel research, you should still be able to come up with a plausible range for each parameter value. Based on these ranges, you would then run a number of simulations, systematically using values from within the ranges. Each combination of parameter values then yields a minimum number of clusters. From there you can either take a conservative approach and use the maximum of these, or some other summary such as the median.
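The grid-and-summarize logic above can be sketched in code. The following is an illustrative Python stand-in (Python rather than R purely so the mechanics are self-contained; it uses a simple linear regression instead of a mixed model, and the effect sizes, residual SDs, significance level, and sample-size grid are all placeholder assumptions, not recommendations for the actual study):

```python
# Illustrative sketch: grid-based a priori power simulation.
# A simple linear model stands in for the mixed model; all
# parameter values below are placeholder assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def power_for(n, beta, sigma, alpha=0.05, n_sims=200):
    """Monte Carlo estimate of power to detect slope `beta`
    with n observations and residual SD `sigma`."""
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        y = beta * x + rng.normal(scale=sigma, size=n)
        if stats.linregress(x, y).pvalue < alpha:
            hits += 1
    return hits / n_sims

def min_n(beta, sigma, target=0.80, candidates=range(10, 201, 10)):
    """Smallest candidate n reaching the target power (None if none does)."""
    for n in candidates:
        if power_for(n, beta, sigma) >= target:
            return n
    return None

# Plausible ranges for the unknown parameters (assumed for illustration)
betas = [0.3, 0.5]
sigmas = [1.0, 1.5]
results = {(b, s): min_n(b, s) for b in betas for s in sigmas}

# Conservative choice: the largest minimum n across the grid
conservative_n = max(v for v in results.values() if v is not None)
print(results, conservative_n)
```

In the real study, the inner simulate-and-fit step would be a mixed model (e.g. via lme4/simr in R) and the loop would also cover the random-effect parameters, but the grid-and-summarize structure is the same.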

Robert Long
    +1. An alternative way of putting this would be: "If you have no idea whatsoever what effect sizes are realistically possible, then what are you doing running this study?" – Stephan Kolassa Nov 09 '20 at 12:05
  • ... also, I have seen the recommendation of using an effect size you would be sorry to miss. This sounds like reasonable advice to me. – Stephan Kolassa Nov 09 '20 at 12:06
  • So instead of only looping through different values of effect size, I would also loop through different magnitudes for the other parameters, such as sigma and variances? – Max J. Nov 09 '20 at 12:07
  • @StephanKolassa thanks :D I was actually thinking along those lines, but wanted to be a bit more diplomatic! And your 2nd comment is also very good advice. Of course, in mixed models, we also need to consider the parameters for the random effects. – Robert Long Nov 09 '20 at 12:08
  • @MaxJ. yes that is correct. – Robert Long Nov 09 '20 at 12:08
  • @RobertLong Thank you! – Max J. Nov 09 '20 at 12:12
  • @RobertLong Would it be meaningful to additionally conduct a post-hoc power analysis using the actual parameters I will find for my data? – Max J. Nov 09 '20 at 12:28
  • That is a whole other can of worms. It depends what you are going to do with it. This is a good thread on the topic: https://stats.stackexchange.com/questions/71387/when-if-ever-is-it-a-good-idea-to-do-a-post-hoc-power-analysis – Robert Long Nov 09 '20 at 12:34
  • After running different simulations, I find that both the fixed effect of the term I am interested in and sigma affect the power of my model. For the fixed effect, I can come up with a reasonable value space. However, what do I do with sigma? I found cases online where people simply used sigma = 1 or sigma = 2. Sigma obviously affects the effect size f² of the relevant parameter. I am a bit stuck on how to interpret my simulation results if they depend on sigma. – Max J. Nov 11 '20 at 10:44
  • What is "sigma" ? – Robert Long Nov 11 '20 at 10:47
  • Sigma is the residual standard deviation I specify for my simulated model. See here: https://cran.r-project.org/web/packages/simr/vignettes/fromscratch.html In this example they specify it as s = 1 (residual standard deviation) – Max J. Nov 11 '20 at 10:50
  • Ahh, well I think that's a bit misleading because it will depend on the scale of the observed variables. It's much more important to think in terms of the ratio between the residual variance and the variance of the random effects - the intra-class correlation (in a simple model) – Robert Long Nov 11 '20 at 11:08
  • I am not sure I can follow. I haven't found a way in simr to compute the f² for the relevant effect when using the fcompare test in the powerSim function. I think this would solve my confusion as the f² would depend both on the specified residual sd and the specified fixed effect, right? Is there a way to do this? I scaled all my predictors, if this is of any help. My VarCorr (should be equivalent to the icc?) is set to 0.1 – Max J. Nov 11 '20 at 11:14
  • I think you need to ask a new question about that, but please try to keep it statistical, rather than about software, otherwise it may be closed as off-topic. I don't know what the fcompare test in the powerSim function is, but if I were you I would do all of this by hand by simulating the data myself; that way I would have control over everything. Start with a simple linear model, and then move on to a mixed model. – Robert Long Nov 11 '20 at 11:19
  • I understand. I will try it with a software-related question. Thank you for your help, Robert! – Max J. Nov 11 '20 at 11:22
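Robert's suggestion to simulate by hand, starting with a simple linear model before moving to a mixed model, can be sketched as follows. This is a hypothetical Python illustration of a power simulation for comparing two nested models that differ by two parameters (tested here with an F-test on the two extra terms); the coefficients, residual SD, and predictors are placeholder assumptions, and the actual study would instead compare nested mixed models, e.g. with a likelihood-ratio test:

```python
# Hypothetical sketch: by-hand power simulation for comparing two
# nested linear models differing by two parameters (x2 and x3).
# All parameter values are placeholder assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def fit_rss(X, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def nested_power(n, b2, b3, sigma, alpha=0.05, n_sims=200):
    """Estimated power of the F-test comparing the full model
    (intercept, x1, x2, x3) against the reduced model (intercept, x1)."""
    hits = 0
    for _ in range(n_sims):
        x1, x2, x3 = rng.normal(size=(3, n))
        y = 0.5 * x1 + b2 * x2 + b3 * x3 + rng.normal(scale=sigma, size=n)
        X_full = np.column_stack([np.ones(n), x1, x2, x3])
        X_red = np.column_stack([np.ones(n), x1])
        rss_full, rss_red = fit_rss(X_full, y), fit_rss(X_red, y)
        df1, df2 = 2, n - X_full.shape[1]  # 2 extra parameters
        F = ((rss_red - rss_full) / df1) / (rss_full / df2)
        if stats.f.sf(F, df1, df2) < alpha:
            hits += 1
    return hits / n_sims
```

Once this simple version behaves sensibly, the data-generating step can be extended with a per-participant random intercept (and slopes, if assumed) and the fit swapped for a mixed model, while keeping the same simulate-test-count structure.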