I have a dataset of 12627 records (each with a value and an sd) and want to get the sum and the overall uncertainty of the sum. I ran a Monte Carlo analysis with 10000 simulations and got the result below. As you can see, the 95% CI bounds are very close to the mean.
> upper mean lower
> 8092.260 7850.384 7608.509
I plotted the 10000 simulated sums in a histogram, which looks like this. The tails stretch to -20,000 and 40,000. So I have two questions:
- Does this histogram have any relationship with the confidence interval? If so, why do the two ranges differ so greatly?
- Is it normal that my 95% CI is so close to the mean?
The raw data look like this (you can download them from this Google Drive link):
catch_mean catch_std
0.0003 0.0018
0.0156 0.0356
0.0230 0.0694
0.0906 0.0999
0.1121 0.2553
0.6705 0.7395
0.0222 0.0518
0.0891 0.6350
0.0003 0.0007
0.0127 0.0437
0.0560 0.0615
0.0180 0.0411
0.0515 0.0565
0.0110 0.0380
...
(a total of 12627 records)
The R code for the Monte Carlo simulation is below:
library(gmodels)
library(Rmisc)
df <- read.csv('Monte carlo data.csv')
dfsim <- data.frame(sim = double())
for (j in 1:10000) {
  # vectorized draw: rnorm picks one random value per record (12627 numbers), then sum them up
  dfsim[nrow(dfsim) + 1, ] <- sum(rnorm(length(df$catch_mean), df$catch_mean, df$catch_std))
}
CI(dfsim$sim) # calculate 95%CI
hist(dfsim$sim) # plot the histogram of the simulated sums
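
To make the two ranges in question easier to compare, here is a minimal self-contained sketch. It uses synthetic stand-in data rather than the real CSV (the names `n_rec`, `n_sim`, `sims`, `ci_mean`, `range_95` and `sd_sum` are made up for illustration, and the uniform ranges for the means and sds are arbitrary assumptions). It computes (a) a t-based 95% CI of the mean of the simulated sums, which to my understanding is what `CI()` from Rmisc reports, and (b) the central 95% range of the simulated sums themselves, which is what the histogram displays.

```r
# Synthetic stand-in for the real data (assumption: NOT the actual catch data)
set.seed(1)
n_rec <- 1000                        # the real dataset has 12627 records
catch_mean <- runif(n_rec, 0, 1)     # hypothetical means
catch_std  <- runif(n_rec, 0, 2)     # hypothetical standard deviations

n_sim <- 2000                        # fewer simulations than 10000, to keep it quick
# Each simulated sum: draw one normal value per record, then add them up
sims <- replicate(n_sim, sum(rnorm(n_rec, catch_mean, catch_std)))

# (a) t-based 95% CI of the MEAN of the simulated sums
se <- sd(sims) / sqrt(n_sim)
ci_mean <- mean(sims) + c(-1, 1) * qt(0.975, df = n_sim - 1) * se

# (b) central 95% range of the simulated sums themselves
range_95 <- quantile(sims, c(0.025, 0.975))

# Analytic sd of the sum, assuming the records are independent
sd_sum <- sqrt(sum(catch_std^2))

print(ci_mean)
print(range_95)
print(c(simulated_sd = sd(sims), analytic_sd = sd_sum))
```

In this sketch the width of (a) shrinks as `n_sim` grows, while (b) stays roughly the same, so the two numbers are measuring different things.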
