I have data frame named "finalend", I conducted multiple logistic regression model named "model" with all predictors as categorical or binomial variables and "TestAnxiety" as outcome variable, I was asked to calculate power, but I do not know how to do it, after reading some posts here and using copilot:
1- I think I should use simulation
2- The intuition I got is doing simulation including logistic regression model with the same data, but changing only outcome variable "TestAnxiety" for each observation randomly with each iteration based on the probability calculated from the regression model "model" I have done before, I then calculated power for each coefficient
3- I want to know if that is the correct approach and if there is anything wrong in my code
here is my code in r:
> # Set parameters
> n <- nrow(finalend) # Sample size for each simulation
> B <- 1000 # Number of simulations
>
> # Get the estimated coefficients from your fitted model
> betas <- coef(model)
>
> # Initialize power
> power <- rep(0, length(betas)-1)
>
> # Convert factor variables in finalend into dummy variables
> finalend_dummy <- model.matrix(~. - 1, data = finalend)
>
> # Loop over simulations
> for (b in 1:B) {
+
+ # Subset the dataframe using the names of the coefficients
+ predictors <- finalend_dummy[, names(betas)[-1]]
+
+ # Calculate the log-odds and probability for each observation
+ logit <- betas[1] + predictors %*% betas[-1]
+ prob <- exp(logit) / (1 + exp(logit))
+
+ # Generate random outcomes based on these probabilities
+ TestAnxiety <- rbinom(n, 1, prob)
+
+ # Combine TestAnxiety and predictors into a data frame
+ data <- as.data.frame(cbind(TestAnxiety, predictors))
+
+ # Fit the logistic regression model to the generated data
+ sim_model <- glm(TestAnxiety ~ ., family = binomial(), data = data)
+
+ # Check if coefficients are significant
+ for (i in 2:length(betas)) {
+ if (summary(sim_model)$coefficients[i, 4] < 0.05) {
+ power[i-1] <- power[i-1] + 1
+ }
+ }
+ }
>
> # Calculate power
> power <- power / B
>
> # Print power
> print(power)
[1] 0.083 0.057 0.044 0.204 0.073 0.244 0.101 0.101 0.064 0.267 0.114 0.109 0.144 0.601 0.859 0.100
[17] 0.286 0.147 0.106 0.082 0.998 1.000 0.259 0.329 0.059 0.701 0.479 0.295 0.995 0.094 0.770 0.370
[33] 0.653 0.259 0.134 0.600 0.082 0.307 0.877 0.351 0.460 0.855 0.855
Edit: I think calculating post hoc power is not a good idea, but instead I would calculate sample size for a follow up study, how I edit this code to calculate it?