
I'm running a simple test in Excel using sample size $n=50$ and taking 1000 observations of an exponential distribution with $\lambda = 2$, so the true mean is 0.5. I generate each sample with the following formula: $-\ln(rand())/2$

Computing a 95% confidence interval under the t-distribution with df=999 and $\alpha=0.025$, my t-value is 1.962, but the confidence intervals I generate all contain the true mean of 0.50, rather than fewer than 95% of them, which is what I would expect given the skewed distribution.

The CI is being computed simply: mean(sample) +/- 1.962*(std deviation of the sample)

Any ideas why?

1 Answer


I see three issues worth addressing.

  1. $n=50$ and “taking 1000 observations” are incompatible ideas. Do you have $50$ observations or $1000$? You will use this number in step 2. (Do you mean that you do 1000 simulations of 50 draws from your exponential distribution?)

  2. A confidence interval is calculated using the standard error, which is related to the standard deviation, though perhaps not closely enough to warrant a name so similar that it confuses everyone who is learning statistics. There are other posts on here that discuss this in more detail, but you want to use $s/\sqrt{n}$, not just $s$, where $s$ is the sample standard deviation and $n$ is the number of observations in each sample. (This is the main point of my answer.)

  3. Especially if you have $1000$ observations, but probably even for $50$, the t-based confidence interval will be quite robust to the skewness, so when you use the right equation to calculate your confidence intervals, you should get about $95\%$ of the intervals containing the true mean.
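To check the points above outside of Excel, here is a minimal Python sketch of the experiment, assuming the setup is 1000 simulations of 50 draws each from an exponential distribution with rate 2. The function name `coverage` and its parameters are my own for illustration. The critical value 2.0096 is the t quantile for 49 degrees of freedom, hard-coded because the standard library has no t quantile function; using the 1.962 from the question changes the result only slightly.

```python
import math
import random
import statistics


def coverage(n=50, sims=1000, rate=2.0, t_crit=2.0096,
             use_standard_error=True, seed=0):
    """Return the fraction of simulated intervals containing the true mean.

    t_crit defaults to the 97.5% t quantile for n - 1 = 49 degrees of
    freedom (an assumption tied to n=50; change it if n changes).
    """
    rng = random.Random(seed)
    true_mean = 1.0 / rate  # mean of an exponential with this rate
    hits = 0
    for _ in range(sims):
        sample = [rng.expovariate(rate) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation
        # Correct interval uses the standard error s / sqrt(n);
        # the mistaken interval from the question uses s itself.
        spread = s / math.sqrt(n) if use_standard_error else s
        half_width = t_crit * spread
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / sims


print("using s/sqrt(n):", coverage(use_standard_error=True))
print("using s:        ", coverage(use_standard_error=False))
```

With $s/\sqrt{n}$ the printed coverage should land in the neighborhood of $95\%$, possibly a little below because of the skewness. With plain $s$ the intervals are roughly $\sqrt{50} \approx 7$ times too wide, so essentially every interval contains $0.5$, which matches what the question describes.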

Dave
  • Thanks, this was the error: I was using the standard deviation instead of the standard error of the mean, which, as you correctly state, is sqrt(var(x)/n). – mathcomp guy Sep 16 '20 at 03:50