I have a sample $X=(X_1,\dots,X_n)$ of i.i.d. Poisson variables with $n=100$ and $\overline{X}=8.8$.

My goal is to obtain an $80\%$ confidence interval for the parameter $\lambda=\theta$. That is, the confidence level is $\gamma=0.8$.

I have looked through "Confidence interval for a small number of iid Poisson", "95% Confidence interval of $\lambda$ for $X_1,...,X_n$ IID exponential with rate $\lambda$", Math. "Confidence interval of Poisson-distributed r.v.", and Math. "An exact and an approximate interval for a Poisson distribution", which provide partial derivations as well as formulae that give the numeric answers. My problem is that every derivation has missing steps that I cannot reconstruct. Below is what I have done so far:

Consider the statistic $T(X)=n\overline{X}=\sum_{i=1}^nX_i$. $T$ is a sum of independent Poisson r.v.s, so $T\sim\text{Poisson}(n\theta)$, the c.d.f. of which is $$F_{\theta}(t)=\sum_{k=0}^t\frac{(n\theta)^k}{e^{n\theta}k!}$$

Taking the derivative with respect to $\theta$ term by term, the sum telescopes: $$\frac{\partial F_{\theta}(t)}{\partial\theta}=\sum_{k=0}^t\frac{n^k\theta^{k-1}(k-n\theta)}{e^{n\theta}k!}=-\frac{n(n\theta)^t}{e^{n\theta}t!}<0,$$ which means that $F_{\theta}(t)$ is strictly decreasing in $\theta$ for every fixed $t$.

The confidence interval is given by $$(\underline{\theta_n},\overline{\theta_n})=\{\theta: \gamma_1<F_{\theta}(T+0),F_{\theta}(T)<\gamma_2\}$$ where $\gamma_1,\gamma_2\in(0,1):\gamma_2-\gamma_1=\gamma$.

By definition of the c.d.f., \begin{align*}&F_{\theta}(T)=\sum_{k=0}^{n\overline{X}-1}\frac{(n\theta)^k}{e^{n\theta}k!}=\gamma_2\\&F_{\theta}(T+0)=\sum_{k=0}^{n\overline{X}}\frac{(n\theta)^k}{e^{n\theta}k!}=\gamma_1\end{align*}

Using the relationship between $\text{Poisson}$ and $\chi^2$, \begin{align*}&F_{\theta}(T)=1-H_{2n\overline{X}}(2n\theta)\\&F_{\theta}(T+0)=1-H_{2(n\overline{X}+1)}(2n\theta)\end{align*} where $H_{m}(x)$ is the c.d.f. of a $\chi^2(m)$ r.v.

Assuming a symmetric CI?, \begin{align*}&1-F_{\theta}(T)=H_{2n\overline{X}}(2n\theta)=\frac{1-\gamma}{2}\quad\dagger\\&F_{\theta}(T+0)=1-H_{2(n\overline{X}+1)}(2n\theta)=\frac{1-\gamma}{2}\quad\ddagger\end{align*} which gives \begin{align*} &\underline{\theta_n}=\frac{1}{2n}\chi^2_{\frac{1-\gamma}{2}}(2n\overline{X})\\ &\overline{\theta_n}=\frac{1}{2n}\chi^2_{\frac{1+\gamma}{2}}(2n\overline{X}+2) \end{align*} where $\chi^2_{p}(m)$ is the $p$-quantile of the $\chi^2(m)$ distribution.

I know how to use this formula to compute the CI in R, but I don't know how to justify the transition to expressions $\dagger,\ddagger$. I also don't understand how to obtain the approximate interval $$\Big(\overline{X}-z_{\frac{1+\gamma}{2}}\sqrt{\frac{\overline{X}}{n}},\ \overline{X}+z_{\frac{1+\gamma}{2}}\sqrt{\frac{\overline{X}}{n}}\Big)$$ where $z_{p}$ is the $p$-quantile of the standard normal distribution. I know I can view the $\chi^2$ distribution as a sum of squared normals that, by the CLT, will tend to normal, but this argument doesn't seem to give the solution I need.
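
For concreteness, here is a minimal R sketch of both intervals for the numbers in this question. It assumes the observed sum is the integer $T=n\overline{X}=880$, takes the lower endpoint from a $\chi^2$ quantile with $2T$ degrees of freedom and the upper one with $2T+2$, and uses the normal quantile at $(1+\gamma)/2$ for the approximate interval. The variable names (`T_obs`, `exact`, `wald`) are only illustrative.

```r
# Assumed inputs from the question: n = 100, sample mean 8.8, level gamma = 0.8
n     <- 100
xbar  <- 8.8
gamma <- 0.8
T_obs <- round(n * xbar)          # observed sum, an integer count
alpha <- 1 - gamma

# Chi-squared interval: lower endpoint uses 2T d.f., upper endpoint uses 2T + 2 d.f.
exact <- c(lower = qchisq(alpha / 2,     2 * T_obs)     / (2 * n),
           upper = qchisq(1 - alpha / 2, 2 * T_obs + 2) / (2 * n))

# Approximate (Wald) interval: normal quantile at (1 + gamma) / 2,
# with the variance theta / n estimated by xbar / n
z    <- qnorm((1 + gamma) / 2)
wald <- c(lower = xbar - z * sqrt(xbar / n),
          upper = xbar + z * sqrt(xbar / n))

print(rbind(exact, wald))
```

With $T$ this large the two rows should be close to each other; the chi-squared row is the one whose endpoints put tail probability $(1-\gamma)/2$ on each side.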

My questions are,

  1. How can I justify the transition to $\dagger,\ddagger$?
  2. Is there a way to derive the approximate CI from where I am now? What should I do to get this result?
  • The 'Wald' CI in your last display requires large $\bar X$. There are many styles of CIs for Poisson $\lambda$, each with its own advantages and disadvantages. – BruceET Apr 23 '22 at 14:49

1 Answer

In much the same way that the Agresti-Coull CI for binomial $p$ approximately inverts the normal test for $H_0: p = p_0$ vs. $H_a: p \ne p_0,$ the following 95% CI for $\lambda$ approximately inverts the normal test for $H_0: \lambda = \lambda_0$ vs. $H_a: \lambda \ne \lambda_0.$

Example: Suppose you have $X_1, \dots, X_n$ iid $\mathsf{Pois}(\lambda),$ so that $$T = \sum_{i=1}^n X_i \sim \mathsf{Pois}(n\lambda).$$ Then an approximate 95% CI for $\lambda$ is $$(T + 2)/n \pm (1.96\sqrt{T+1})/n.$$
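
One way to see where the $+2$ and $+1$ may come from is the following sketch. It assumes the normal approximation $T \approx \mathsf{Norm}(n\lambda_0, n\lambda_0)$ under $H_0$ and writes $z$ for the upper 2.5% point of the standard normal (a symbol introduced here, not used above). The test does not reject $\lambda_0$ when $|T - n\lambda_0| \le z\sqrt{n\lambda_0}$; squaring and solving the quadratic in $n\lambda_0$ gives

\begin{align*}
(T-n\lambda_0)^2 \le z^2\, n\lambda_0
&\iff (n\lambda_0)^2-(2T+z^2)\,n\lambda_0+T^2\le 0\\
&\iff T+\tfrac{z^2}{2}-z\sqrt{T+\tfrac{z^2}{4}}\;\le\; n\lambda_0\;\le\; T+\tfrac{z^2}{2}+z\sqrt{T+\tfrac{z^2}{4}}.
\end{align*}

With $z=1.96$ we have $z^2/2\approx 1.92\approx 2$ and $z^2/4\approx 0.96\approx 1$, and dividing by $n$ gives the endpoints $(T+2)/n \pm 1.96\sqrt{T+1}/n$.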

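    set.seed(2022); lam = 10; n = 50
    x = rpois(n, lam)     # simulated Poisson sample
    t = sum(x)            # T, the sum of the observations
    # endpoints (T + 2)/n -/+ 1.96*sqrt(T + 1)/n
    CI = (t + 2 + qnorm(c(.025,.975))*sqrt(t+1))/n
    CI
    [1]  9.373022 11.146978  ## contains lam=10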
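
A single simulated sample only shows one interval. As a rough check of the "approximate 95%" claim, the sketch below computes the coverage probability of this interval directly from the Poisson p.m.f. at one assumed setting ($\lambda=10$, $n=50$, matching the simulation above). The grid `t_vals` and the name `coverage` are illustrative choices, and the result says nothing about other values of $n\lambda$.

```r
# Coverage of (T + 2)/n +/- z*sqrt(T + 1)/n at one assumed setting
lam <- 10; n <- 50
z   <- qnorm(0.975)

t_vals <- 0:2000                                  # grid carrying essentially all Pois(500) mass
lower  <- (t_vals + 2 - z * sqrt(t_vals + 1)) / n
upper  <- (t_vals + 2 + z * sqrt(t_vals + 1)) / n

# Sum the Poisson probabilities of the outcomes whose interval contains lam
covers   <- lower <= lam & lam <= upper
coverage <- sum(dpois(t_vals, n * lam)[covers])
coverage
```

Because $T$ is discrete, the coverage is not exactly 0.95 and moves slightly as $\lambda$ changes.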

In its section on confidence intervals, the Wikipedia page on the Poisson distribution suggests the chi-squared-based CI, with endpoints computed in R below:

    .5*qchisq(.025, 2*t)/n; .5*qchisq(.975, 2*t + 2)/n
    [1] 9.352981
    [1] 11.14577
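
The chi-squared endpoints rest on the identity $P(Y \le k) = 1 - H_{2(k+1)}(2\mu)$ for $Y \sim \mathsf{Pois}(\mu)$, where $H_m$ is the $\chi^2(m)$ c.d.f. The sketch below checks it numerically and confirms that each endpoint leaves tail probability 0.025. The values `t_obs = 500` and `mu = 480` are hypothetical, chosen only for illustration, and are not the simulated data above.

```r
# Hypothetical values for illustration (not the simulated sample above)
n <- 50; t_obs <- 500; level <- 0.95
a <- 1 - level

lo <- 0.5 * qchisq(a / 2,     2 * t_obs)     / n   # lower endpoint, 2T d.f.
hi <- 0.5 * qchisq(1 - a / 2, 2 * t_obs + 2) / n   # upper endpoint, 2T + 2 d.f.

# Poisson / chi-squared identity at an arbitrary mean mu
mu <- 480
identity_ok <- isTRUE(all.equal(ppois(t_obs, mu),
                                1 - pchisq(2 * mu, 2 * (t_obs + 1))))

# Tail probabilities at the endpoints: both should be close to a/2
tail_lo <- ppois(t_obs - 1, n * lo, lower.tail = FALSE)  # P(Y >= t_obs) when lambda = lo
tail_hi <- ppois(t_obs, n * hi)                          # P(Y <= t_obs) when lambda = hi
c(identity_ok = identity_ok, tail_lo = tail_lo, tail_hi = tail_hi)
```

Putting equal tail probability on each side is what makes this an equal-tailed interval; other splits of the total $1-\text{level}$ are possible but less common.
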
BruceET