I am trying to compute a bootstrapped CI for the Kendall's concordances statistic and need to present the acceleration term used.
What is the acceleration term used in the boot.ci() from R's boot package when using the BCa (adjusted bootstrap percentile) method? Say with the following code:
library(boot)
library(DescTools)
Create function to compute my estimator
my.estimator = function(data, i){ KendallW(data[i, c("var1", "var2")], correct=TRUE) }
R = 1000 #number of bootstrap resamples
Get the bootstrap object
b = boot(data, my.estimator, R)
Get confidence intervals
boot.ci(b, conf = 0.95, type = c("bca"))
It is not entirely obvious from the package description which method is used to estimate the acceleration term, but I think it is the usual jackknife. If so, does the following code (taken from a previous SE post) present the correct method to estimate the BCa confidence interval manually? These two methods did not provide the same intervals.
theta_hat = KendallW(data, correct=TRUE)
n = nrow(data)
I = rep(NA, n)
for(i in 1:n){
#Remove ith data point
xnew = data[-i, ]
#Estimate theta
theta_jack = KendallW(xnew, correct=TRUE)
I[i] = (n-1)*(theta_hat - theta_jack)
}
#Estimate a
a_hat = (sum(I^3)/sum(I^2)^1.5)/6
Use this acceleration constant in own bootstrap algorithm
Desired quantiles
alpha = 0.05
u = c(alpha/2, 1-alpha/2)
B = 1000 #number of bootstrap resamples
theta_boot = rep(NA, B)
for(i in 1:B){
#Select a bootstrap sample
xnew = sample(data, length(data), replace=TRUE)
#Estimate index
theta_boot[i] = KendallW(xnew, correct=TRUE)
}
#Compute constants
z0 = qnorm(mean(theta_boot <= theta_hat))
zu = qnorm(u)
#Adjusted quantiles
u_adjusted = pnorm(z0 + (z0+zu)/(1-a_hat*(z0+zu)))
#Accelerated Bootstrap CI
quantile(theta_boot, u_adjusted)
A mock data is:
data = structure(list(var1 = structure(c(3, 1, 1, 1, 3, 0, 3, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 2, 1, 0, 2, 0, 0, 1, 1, 0, 0, 2, 1, 1, 0), label = "Variable 1", class = c("labelled", "numeric")),
var2 = structure(c(1, 0, 0, 0, 3, 0, 3, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 2, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 3, 0, 0, 0, 0, 1, 0, 2, 1, 2, 0, 0, 0), label = "Variable 2", class = c("labelled", "numeric"))),
row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"))