
I am trying to understand how much power we lose by selecting a categorical exposure versus a continuous exposure in a generalised linear model (GLM).

I didn't have problems calculating power using the effect size from the original dataset. We did that using the simr package.

For model 1 (continuous exposure):

model1b       <- glm(y ~ x, family=binomial(link='logit'), data=data)
power_model1b <- powerSim(model1b, nsim=1000, seed=30, test=fixed("x"))

For model 2 (categorical exposure):

model2b              <- glm(y ~ categorical_x, family=binomial, data=data)
power_model2b_low    <- powerSim(model2b, nsim=1000, seed=30, 
                                 test=fixed("categorical_xlow"))
power_model2b_medium <- powerSim(model2b, nsim=1000, seed=30, 
                                 test=fixed("categorical_xmedium"))

We saw that power decreases when we use the categorical measure. However, I have read that using the effect size estimated from the data already collected can bias the power calculation, and that a solution is to set a fixed effect size and then make the comparison.

I am confused about how to address this, because model 1 uses a continuous exposure and model 2 a categorical one. Would it be okay to simply change the coefficients with simr's coef option, as below, and then rerun the calculations to compare power between the models?

model_fixed1b <- model1b
coef(model_fixed1b)["x"] <- -0.22                    # fixed effect size for the continuous exposure
model_fixed2b <- model2b
coef(model_fixed2b)["categorical_xlow"]    <- -0.22  # fixed effect size for the "low" dummy
coef(model_fixed2b)["categorical_xmedium"] <- -0.22  # fixed effect size for the "medium" dummy
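
And then, if that is a valid approach, I would rerun the power simulations on the modified models to compare, something like:

power_fixed1b        <- powerSim(model_fixed1b, nsim=1000, seed=30, test=fixed("x"))
power_fixed2b_low    <- powerSim(model_fixed2b, nsim=1000, seed=30,
                                 test=fixed("categorical_xlow"))
power_fixed2b_medium <- powerSim(model_fixed2b, nsim=1000, seed=30,
                                 test=fixed("categorical_xmedium"))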
Andrea

1 Answer


Don't use a canned simulation function to conduct your power analysis; just write your own simulation from scratch. Your situation isn't that difficult. There are various resources on CV to help you simulate a logistic regression model (in particular, see: Simulation of logistic regression power analysis - designed experiments).

The background information in your question is rather sparse, but I gather you think there is a linear (in log odds) relationship between a continuous $X$ and a binary $Y$. You want to plan a subsequent study based on the same relationship found in your data, but wonder what would happen if you dichotomized $X$. So just simulate that, but in one case leave $X$ as is, and in the other case dichotomize it. Then compare the observed power. Here is a simple example (coded in R):

set.seed(7309)                     # this makes the example exactly reproducible
x  = runif(30, min=0, max=2)       # these are our original x-values
lo = 0 + 1*x                       # I simulate the true underlying log odds
p  = exp(lo)/(1+exp(lo))           # I convert them to probabilities  
y  = rbinom(30, size=1, prob=p)    # I generate the observed, binomial y-values
m  = glm(y~x, family=binomial)     # I'm pretending this is your original model
B0 = coef(m)[1]                    # here I extract the fitted coefficients  
B1 = coef(m)[2]
lo = B0 + B1*x                     # which I use to create a new set of log odds...
p  = exp(lo)/(1+exp(lo))           # ... and probabilities

x.cat = ifelse(x>median(x), 1, 0) # here I dichotomize your x-values

p.vect.cont = vector(length=1000)  # these will store the p-values
p.vect.cat  = vector(length=1000)
for(i in 1:1000){
  y      = rbinom(30, size=1, prob=p)            # I simulate new observed y's each time,
  m.cont = glm(y~x,     family=binomial)         # refit the models,
  m.cat  = glm(y~x.cat, family=binomial)
  p.vect.cont[i] = coef(summary(m.cont))[2,4]    # and get the p-values
  p.vect.cat[i]  = coef(summary(m.cat))[2,4]
}
mean(p.vect.cont<.05)  # [1] 0.389               # these are empirical power estimates
mean(p.vect.cat<.05)   # [1] 0.21
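
If you'd rather mirror the three-level exposure in your question instead of a median split, you can do the same thing with a cut into categories. The tercile cut points below are just an illustration (use your actual category boundaries), and since a multi-level factor has no single slope, I test the factor as a whole with a likelihood ratio test:

x.cat3 = cut(x, breaks=quantile(x, probs=c(0, 1/3, 2/3, 1)),  # cut points assumed; use your own
             labels=c("low","medium","high"), include.lowest=TRUE)
p.vect.cat3 = vector(length=1000)
for(i in 1:1000){
  y      = rbinom(30, size=1, prob=p)
  m.cat3 = glm(y~x.cat3, family=binomial)
  p.vect.cat3[i] = anova(m.cat3, test="LRT")[2, "Pr(>Chi)"]   # overall test of the factor
}
mean(p.vect.cat3<.05)              # empirical power for the 3-level version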