
I have a silly question about using bootstrap methods to calculate a p-value.

Suppose I have a dataset that is likely normally distributed. I want to determine whether a particular value X is significantly high relative to the dataset by reporting a p-value.

As I see it, there are two options:

  1. Since the dataset is roughly normally distributed, I could calculate the mean and sd, look up the area for z > (X - mean)/sd in a Z table, and use that as the p-value.

  2. Use the bootstrap. I could generate a new sample S (of length N) from the original dataset by bootstrap resampling, then get the p-value as sum(S > X)/N.
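To make the two options concrete, here is a minimal Python sketch using only the standard library. The data are simulated, and `x = 13.0`, the function names, and the seed are illustrative assumptions rather than anything from my real dataset. I also compute the plain fraction of original observations above X for comparison, since the expected value of sum(S > X)/N over bootstrap draws is exactly that fraction.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_tail_p(data, x):
    """Option 1: upper-tail area P(Z > (x - mean) / sd) under a fitted normal."""
    z = (x - mean(data)) / stdev(data)
    return 1.0 - NormalDist().cdf(z)

def resample_tail_p(data, x, rng):
    """Option 2 as described: draw one bootstrap sample S of length N
    (sampling with replacement) and return sum(S > x) / N."""
    n = len(data)
    s = rng.choices(data, k=n)
    return sum(v > x for v in s) / n

def empirical_tail_p(data, x):
    """Fraction of the original observations that exceed x."""
    return sum(v > x for v in data) / len(data)

rng = random.Random(0)                             # fixed seed, for repeatability
data = [rng.gauss(10.0, 2.0) for _ in range(500)]  # simulated, illustrative data
x = 13.0                                           # hypothetical value of interest

p_normal = normal_tail_p(data, x)
p_boot = resample_tail_p(data, x, rng)
p_emp = empirical_tail_p(data, x)
print(p_normal, p_boot, p_emp)
```

On roughly normal data the three numbers should land close together, which suggests that option 2 is mostly a noisier version of counting the original observations above X.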

My question: is the bootstrap used correctly in the second option? Also, could you recommend books for better understanding the bootstrap and permutation tests?

Thanks.

ccshao
  • What do you mean by "significantly high"? Do you want to check whether X is an outlier? – steffen Oct 23 '12 at 11:11
  • No, I want to give a p-value for data that is equal to or higher than X. – ccshao Oct 23 '12 at 11:42
  • Unless I misread it, the question sounds like an effort to detect outliers. The bootstrap doesn't apply here and, besides, it's fruitless. Even the maximum value in the dataset is going to occur in $1 - ((n-1)/n)^n \approx 1 - 1/e \approx 0.63$ of the bootstrap samples, which is far too high to give you any chance at achieving a low p-value. For the books, see http://stats.stackexchange.com/questions/5845, http://stats.stackexchange.com/questions/15692, and http://stats.stackexchange.com/questions/25151. – whuber Oct 23 '12 at 15:19
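
The 0.63 figure in the last comment can be checked numerically. The following is a small sketch, not part of the original thread: it compares the exact expression $1 - ((n-1)/n)^n$ with a Monte Carlo estimate and with the limit $1 - 1/e$. The choice of `n = 100`, 2000 repetitions, and the function names are illustrative assumptions.

```python
import math
import random

def prob_max_included(n):
    """Exact probability that one given observation (e.g. the maximum)
    appears at least once in a bootstrap sample of size n."""
    return 1.0 - ((n - 1) / n) ** n

def simulate_max_included(n, reps, rng):
    """Monte Carlo estimate of the same probability. Indices 0..n-1 stand in
    for distinct observations; index n-1 plays the role of the maximum."""
    hits = 0
    for _ in range(reps):
        # One bootstrap sample: n draws with replacement from n indices.
        if any(rng.randrange(n) == n - 1 for _ in range(n)):
            hits += 1
    return hits / reps

rng = random.Random(0)                      # fixed seed, for repeatability
exact = prob_max_included(100)
approx = simulate_max_included(100, 2000, rng)
limit = 1.0 - 1.0 / math.e
print(exact, approx, limit)
```

All three values should sit near 0.63, so the most extreme observation turns up in roughly two out of three bootstrap samples.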
