I am running an XGBoost model to predict the global economic cost of invasive species. My training set is only about 3000 data points.
I am bootstrapping my predictions, and went with 1000 samples as a default. Other questions here have answers recommending at least 1000 samples (e.g. Rule of thumb for number of bootstrap samples).
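For concreteness, here is a minimal, self-contained sketch of the kind of bootstrap loop I mean. The data are synthetic stand-ins and the hyperparameters are illustrative, not the values from my actual grid:

```python
import numpy as np
import xgboost as xgb
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(3000, 10))           # ~3000 points, like my training set
y = 2 * X[:, 0] + rng.normal(size=3000)   # synthetic target for illustration

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_boot = 1000  # the default I started with
r2_scores = []
for b in range(n_boot):
    # Resample the training rows with replacement
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # Illustrative hyperparameters; in reality each fit sits inside a grid search
    model = xgb.XGBRegressor(n_estimators=200, learning_rate=0.1, n_jobs=-1)
    model.fit(X_train[idx], y_train[idx])
    r2_scores.append(r2_score(y_test, model.predict(X_test)))
```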
My problem is that the hyperparameter grid search is very slow: the whole run is estimated to take a month. I don't currently have access to a bigger computer to increase parallelization, and if a co-author or reviewer asks me to change anything in the model, it will take another month to rerun.
So far I have run 43 samples and calculated the mean R-squared for an increasing number of samples (plotted below). The mean R-squared looks close to stabilizing by 43 samples. Could I get away with only 50 or 100 bootstraps if I show the R-squared doesn't change much after this, or are there other reasons I should run it 1000 times?
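The stabilization check is just a running mean over the per-bootstrap R-squared values, roughly like this (a sketch; `r2_scores` is the list from the loop above):

```python
import numpy as np
import matplotlib.pyplot as plt

# Mean R-squared after 1, 2, ..., k bootstrap samples
running_mean = np.cumsum(r2_scores) / np.arange(1, len(r2_scores) + 1)

plt.plot(running_mean)
plt.xlabel("Number of bootstrap samples")
plt.ylabel("Mean R-squared")
plt.show()
```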
