
I'm performing a mediation analysis in R, block-bootstrapping by participant instead of by row since I have multiple observations for each participant. My code looks like this:

```r
# Create function to block bootstrap by participant.
# boot() supplies row indices, but we resample whole participants
# ourselves, so the indices argument goes unused.
indirectsaved <- function(dataset, indices, formula2, formula3) {

  # Get unique participant ids
  unique_ids <- unique(dataset$id)

  # Sample ids with replacement
  shuffled_ids <- sample(unique_ids, replace = TRUE)

  # Create a new dataframe by appending the rows for each sampled id
  d <- do.call(rbind, lapply(shuffled_ids, function(id) {
    dataset[dataset$id == id, ]
  }))

  model2 <- lm(formula2, data = d)
  model3 <- lm(formula3, data = d)
  a <- coef(model2)[2]
  b <- coef(model3)[3]
  c <- coef(model3)[2]  # direct effect
  indirect <- a * b
  total <- c + indirect

  # Return indirect, total, and direct effects
  c(indirect, total, c)
}

# Calculate bootstrapped results
library(boot)
bootresults <- boot(data = df, statistic = indirectsaved,
                    formula2 = mediator ~ type,
                    formula3 = DV ~ type + mediator,
                    R = 10000)

# Results
bootresults
boot.ci(bootresults, conf = 0.95, type = "norm")

# Calculate proportion mediated
prop_mediated <- bootresults$t[, 1] / bootresults$t[, 2]
mean(prop_mediated)

# Calculate p-value
p_value <- sum(bootresults$t[, 1] <= 0) / length(bootresults$t[, 1])
p_value
```

As far as I know, this method of performing a mediation analysis, and this way of obtaining a p-value from it, are both correct. However, I recently obtained a result where the indirect effect's CI was (-0.02, 1.05), yet the p-value was still 0 (indicating that fewer than 0.01% of bootstrapped indirect effects should be less than 0). What could possibly produce this result?

  • Well, you are using the normal approximation when you derive the CI ... – Roland Aug 15 '23 at 06:19
  • The way you calculate a Bootstrap p value seems to be completely wrong. A bootstrap p value requires the existence and application of a pivotal quantity (to let the data distribution mimic the null hypothesis distribution). I doubt that it is easily available in your case. – Michael M Sep 12 '23 at 13:46

1 Answer


There are two things that come to mind for why you would find discrepancies between the bootstrap p-value and the bootstrap confidence interval.

a) The computation you are using for your p-value is one-sided: you check the proportion of indirect effects in your bootstrap samples that were equal to or smaller than zero, i.e. you expect a positive indirect effect. Assuming you are testing at an alpha level of 5%, the confidence interval, by contrast, is two-sided (2.5% cut off at the lower and 2.5% at the upper end of your empirical sampling distribution) and simply assesses whether your effect differs from zero. If you wanted to do a one-sided test via the CI, you would have to request a 90% CI in boot.ci, which leaves the lowest and highest 5% on either side of the empirical distribution. If that CI does not contain zero, you would further have to check the signs of the lower and upper limits to confirm that your effect is indeed greater than zero rather than smaller than zero.
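To illustrate the mismatch, here is a minimal sketch with simulated bootstrap draws standing in for `bootresults$t[, 1]` (the number of draws, their mean, and their spread are made up for the example): a one-sided p-value can fall below .05 while the two-sided 95% interval still covers zero.

```r
set.seed(1)
# Hypothetical bootstrap draws of the indirect effect
boot_indirect <- rnorm(10000, mean = 0.55, sd = 0.3)

# One-sided p-value as in the question: share of draws at or below zero
p_one_sided <- mean(boot_indirect <= 0)

# Two-sided 95% percentile interval: cuts 2.5% off each tail
ci_95 <- quantile(boot_indirect, c(0.025, 0.975))

# One-sided test at alpha = .05 via a 90% interval:
# only the lower 5% tail matters for a positive effect
ci_90 <- quantile(boot_indirect, c(0.05, 0.95))
```

With these made-up draws, `p_one_sided` is below .05 and the 90% interval excludes zero, yet the 95% interval still contains it — the one-sided test and the two-sided CI disagree by construction, not because anything is broken.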

b) As @Roland already hinted at in his comment, bootstrap confidence intervals can themselves be computed in many different ways and can, for the same data and analysis, lead to different conclusions. You selected the normal-approximation bootstrap CI, which has its own matching p-value. For a more detailed explanation plus sample code see this post: Computing p-value using bootstrap with R
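For instance, the p-value that matches `boot.ci(..., type = "norm")` can be obtained by inverting the same normal approximation the CI is built from. A sketch, where `est` and `boot_draws` are hypothetical stand-ins for `bootresults$t0[1]` and `bootresults$t[, 1]`:

```r
set.seed(2)
est        <- 0.5                      # hypothetical observed indirect effect
boot_draws <- rnorm(10000, 0.52, 0.3)  # hypothetical bootstrap replicates

bias <- mean(boot_draws) - est
se   <- sd(boot_draws)

# Bias-corrected normal 95% CI, as boot.ci type = "norm" constructs it:
# (est - bias) +/- qnorm(0.975) * se
ci_norm <- (est - bias) + c(-1, 1) * qnorm(0.975) * se

# The matching two-sided p-value inverts the same normal approximation,
# so it agrees with this CI by construction
z      <- (est - bias) / se
p_norm <- 2 * pnorm(-abs(z))
```

Because the CI and the p-value come from the same approximation, zero lies inside `ci_norm` exactly when `p_norm` exceeds .05; pairing this CI with the empirical tail proportion from the question offers no such guarantee.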

In the same post @AdamO says:

A comment about lack of invariance of testing: it is entirely possible to find 95% CIs not inclusive of the null yet a p > 0.05 or vice versa.

So even with all of this in mind, there may still be instances where the bootstrap p-value does not lead to the same conclusion as the CI.