I do backward elimination: I iteratively remove the predictor with the largest p-value until the largest remaining p-value is < 0.157.
I then have a model whose displayed confidence intervals are too narrow: "The method yields confidence intervals for effects and predicted values that are falsely narrow; see Altman and Andersen (1989)."
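The selection step itself is roughly this (a minimal Python/statsmodels sketch; `backward_eliminate`, `X` and `y` are my own illustrative names, not from any package, and I assume `X` is a DataFrame of candidate predictors and `y` the response):

```python
import numpy as np
import statsmodels.api as sm

def backward_eliminate(X, y, threshold=0.157):
    """Iteratively drop the predictor with the largest p-value until
    every remaining p-value is below `threshold`."""
    kept = list(X.columns)
    while kept:
        fit = sm.OLS(y, sm.add_constant(X[kept])).fit()
        pvals = fit.pvalues.drop("const")      # ignore the intercept
        worst = pvals.idxmax()                 # predictor with the largest p-value
        if pvals[worst] < threshold:           # stopping rule: all p-values < 0.157
            return fit, kept
        kept.remove(worst)
    # all predictors eliminated: fall back to an intercept-only model
    return sm.OLS(y, np.ones(len(y))).fit(), []
```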
To obtain appropriate confidence intervals, I repeat the procedure with bootstrapping*. Is it correct to compute the confidence intervals this way (sketched in code after the list):
- I store the value of a given parameter each time it is selected** (say it is selected 70,041 times out of 100,000 bootstrap iterations);
- I sort that list and, for the lower bound, take the smallest parameter value such that at least 5% of the stored coefficients fall below it; I do the same for the 95% upper bound;
- I report this as the confidence interval around the value I obtained without bootstrapping*** (i.e. from backward elimination on the original, non-resampled dataset).
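Put together, the whole procedure I have in mind looks like this (a sketch only, reusing the illustrative `backward_eliminate` from above; the 5th and 95th percentiles give a 90% interval):

```python
rng = np.random.default_rng(0)
n = len(y)
boot_coefs = {name: [] for name in X.columns}

B = 100_000                                    # number of bootstrap iterations
for _ in range(B):
    idx = rng.integers(0, n, size=n)           # resample rows with replacement
    fit_b, kept_b = backward_eliminate(X.iloc[idx], y.iloc[idx])
    for name in kept_b:                        # store a coefficient only when it is selected (**)
        boot_coefs[name].append(fit_b.params[name])

# 5th and 95th percentiles of the stored (selected-only) coefficients
ci = {name: np.percentile(vals, [5, 95])
      for name, vals in boot_coefs.items() if vals}

# reported around the estimates from the original, non-resampled data (***)
fit_full, kept_full = backward_eliminate(X, y)
for name in kept_full:
    low, high = ci[name]
    print(f"{name}: {fit_full.params[name]:.3f}  [{low:.3f}, {high:.3f}]")
```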
*Are there situations where this is not valid?
**Why should I store the values of a given parameter only when it is selected and not when it is not? It seems like I'm biasing something there.
***That should be roughly the same value as the mean of this list (though the mean would be less precise, since I only ran 100,000 bootstrap iterations), right?
Sources (scientific articles) are welcome.