
It's about Example 10.1.14 from Casella (2nd ed.). For a random sample $X_1, \dots, X_n$, each having a Bernoulli distribution ($P(X_i=1)=p$), we know $\mathrm{Var}\,X_i=p(1-p)$.

It's said that $\mathrm{Var}_p\hat{p}=\frac{p(1-p)}{n}$. My questions are:

  1. What is the meaning of the subscript $p$?
  2. Why is the variance $\frac{p(1-p)}{n}$ instead of $p(1-p)$?

My thought: since $\hat{p}=\frac{\sum X_i}{n}$, all the $X_i$'s have the same variance, and $n$ is a constant, the variance of $\hat{p}$ is simply divided by $n$.

But even though all the $X_i$'s are iid, they are still different random variables, so can we really compute the variance of $\frac{\sum X_i}{n}$ this way? Not to mention that we have added up $n$ of the $X_i$'s, so it seems the variance should be $\frac{np(1-p)}{n}$, where the $n$'s cancel out.
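
For intuition, here is a quick Monte Carlo check (my own sketch, not part of the book's example): simulate many samples of size $n$, compute $\hat p$ for each, and compare the empirical variance of $\hat p$ with $p(1-p)/n$.

```python
# Sketch: empirical variance of p-hat vs. the claimed p(1-p)/n.
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.3, 50, 100_000

samples = rng.binomial(1, p, size=(reps, n))   # reps samples of n Bernoulli(p) draws
p_hat = samples.mean(axis=1)                   # p-hat = (sum X_i) / n for each sample

print("empirical Var(p-hat):", p_hat.var())      # close to 0.0042
print("p(1-p)/n            :", p * (1 - p) / n)  # 0.0042
```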


Edit:

  1. The subscript $p$ seems to mean 'given that the parameter has the value $p$'.
  2. It seems that $\mathrm{Var}_p\hat{p}=\mathrm{Var}_p\frac{\sum X_i}{n} =E\left[\left(\frac{\sum X_i}{n}\right)^2\right]-\left(E\left[\frac{\sum X_i}{n}\right]\right)^2\\ =\sum_{k=0}^n\left[\left(\frac k n\right)^2{n\choose k}p^k(1-p)^{n-k}\right]-p^2.$

How does one proceed from there? (This has already been answered by @stochasticmrfox.)
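
As a sanity check on that expression (my own sketch; the values of $p$ and $n$ are arbitrary), the sum can be evaluated numerically and compared with $p(1-p)/n$:

```python
# Sketch: evaluate sum_k (k/n)^2 C(n,k) p^k (1-p)^(n-k) - p^2 and compare with p(1-p)/n.
from math import comb

p, n = 0.3, 50
second_moment = sum((k / n) ** 2 * comb(n, k) * p**k * (1 - p) ** (n - k)
                    for k in range(n + 1))

print(second_moment - p**2)   # 0.0042
print(p * (1 - p) / n)        # 0.0042
```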


Edit:

A related question (Example 10.1.17): suppose the $X_i$'s are iid Poisson ($P(X_i=k)=\frac{\lambda^k}{k!}e^{-\lambda}$), and we try to estimate $P(X_i=0)=e^{-\lambda}$ using the estimator $\hat{\tau}=\frac{\sum_i I(X_i=0)}{n}$, where $I(X_i=0)$ indicates whether the event $X_i=0$ happens and has a Bernoulli distribution with parameter $e^{-\lambda}$.

And so $E(\hat\tau)=e^{-\lambda}$ and $\mathrm{Var}\,\hat\tau=\frac{e^{-\lambda}(1-e^{-\lambda})}{n}$. (From this we see that as $n$ increases, the variance decreases and the estimate becomes more precise.)
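
Here is a small simulation of $\hat\tau$ (my own sketch, not from the book) comparing its mean and variance with $e^{-\lambda}$ and $e^{-\lambda}(1-e^{-\lambda})/n$:

```python
# Sketch: tau-hat = (number of zeros in the sample) / n for Poisson(lambda) data.
import numpy as np

rng = np.random.default_rng(1)
lam, n, reps = 2.0, 100, 100_000

samples = rng.poisson(lam, size=(reps, n))
tau_hat = (samples == 0).mean(axis=1)          # fraction of zeros in each sample

print("mean of tau-hat        :", tau_hat.mean())  # close to e^{-2} ~ 0.1353
print("e^{-lambda}            :", np.exp(-lam))
print("empirical Var(tau-hat) :", tau_hat.var())   # close to 0.00117
print("e^-l(1-e^-l)/n         :", np.exp(-lam) * (1 - np.exp(-lam)) / n)
```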

It is said that the MLE of $e^{-\lambda}$ is $e^{-\frac{\sum_i X_i}{n}}$; how do we get this?

My thought: this can be derived in the usual way of computing an MLE (see https://statlect.com/fundamentals-of-statistics/Poisson-distribution-maximum-likelihood): treating the $X_i$ as fixed at observed values $x_i$, we find the $\lambda$ that maximizes the log-likelihood $\log \lambda \sum x_i-\log \prod(x_i!)-n\lambda$, i.e. we set its derivative to zero, $\frac{\sum x_i}{\lambda}-n=0$, which gives $\hat\lambda=\frac{\sum x_i}{n}$.

The new question is: from this we get the MLE of $\lambda$, but why is the MLE of $e^{-\lambda}$ equal to $e^{-(\text{MLE of }\lambda)}$?
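
As a numerical illustration of both points (my own sketch, using simulated data and a grid search; this is a check, not a proof), one can maximize the log-likelihood over $\lambda$ directly, and also over $\tau=e^{-\lambda}$, and observe that the two maximizers are related by $\hat\tau=e^{-\hat\lambda}$:

```python
# Sketch: (1) the Poisson log-likelihood is maximized at lambda = mean(x);
#         (2) maximizing the same likelihood over tau = e^{-lambda} gives e^{-mean(x)}.
import numpy as np

rng = np.random.default_rng(2)
x = rng.poisson(2.0, size=200)
n, s = len(x), x.sum()

def loglik_lambda(lam):
    # log-likelihood up to the constant -sum(log(x_i!)), which does not affect the argmax
    return s * np.log(lam) - n * lam

lam_grid = np.linspace(0.01, 10, 100_000)
lam_mle = lam_grid[np.argmax(loglik_lambda(lam_grid))]
print(lam_mle, x.mean())                       # the two agree up to grid resolution

tau_grid = np.linspace(0.001, 0.999, 100_000)  # tau = e^{-lambda}, so lambda = -log(tau)
tau_mle = tau_grid[np.argmax(loglik_lambda(-np.log(tau_grid)))]
print(tau_mle, np.exp(-x.mean()))              # again the two agree up to grid resolution
```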

1 Answer

  1. Not sure about the subscript.

$$\mathrm{Var}(\hat{p})=\mathrm{Var}\left(\frac{\sum X_i}{n}\right)\\=\frac{1}{n^2}\mathrm{Var}\left(\sum X_i\right)\\=\frac{1}{n^2}\sum \mathrm{Var}(X_i)\\=\frac{n\cdot p(1-p)}{n^2}=\frac{p(1-p)}{n}$$

where the equality $\mathrm{Var}(\sum X_i)=\sum \mathrm{Var}(X_i)$ follows by independence. The key is that $\mathrm{Var}(aY)=a^2\mathrm{Var}(Y)$, where $a$ is a constant and $Y$ is a random variable.
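
A quick simulation (my own sketch, not part of the original answer) checking the two facts being used, $\mathrm{Var}(\sum X_i)=\sum\mathrm{Var}(X_i)$ under independence and $\mathrm{Var}(aY)=a^2\mathrm{Var}(Y)$:

```python
# Sketch: verify the two variance rules on independent Bernoulli(p) draws.
import numpy as np

rng = np.random.default_rng(3)
p, n, reps, a = 0.3, 10, 500_000, 1 / 10

X = rng.binomial(1, p, size=(reps, n))   # n independent Bernoulli(p) columns
S = X.sum(axis=1)                        # sum of the X_i in each replication

print(S.var(), n * p * (1 - p))          # Var(sum X_i) ~ sum Var(X_i) = n p (1-p)
print((a * S).var(), a**2 * S.var())     # Var(aY) = a^2 Var(Y)
```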

  • I see, $\mathrm{Var}\sum X_i$ is the variance of the binomial distribution, which is $np(1-p)$. Besides, is there a name for the proposition you've just mentioned ($\mathrm{Var}(aY)=a^2\mathrm{Var}\,Y$)? I see this formula can be derived from $\mathrm{Var}\,X=E(X^2)-(EX)^2$. – Charlie Chang Oct 31 '20 at 10:04
  • Not sure why $\mathrm{Var} \sum X_i=\sum \mathrm{Var}\, X_i$; I will figure it out. In the book, Example 2.3.5, the author gives a relatively complicated proof that $\mathrm{Var} \sum X_i=np(1-p)$. – Charlie Chang Oct 31 '20 at 10:13
  • Besides the method given in the book, we can also use the generating function $G(s)=(1-p+ps)^n$ together with $G^{(r)}(1)=E[X(X-1)\dots(X-r+1)]$ (a symbolic check of this route is sketched after these comments). – Charlie Chang Oct 31 '20 at 10:45
  • It's a fundamental theorem: the variance of a sum of independent random variables is equal to the sum of the variances of the random variables. – stochasticmrfox Oct 31 '20 at 12:57
  • +1. The reason for the subscript "$p$" is to make it explicit that the variance depends on the underlying distribution, which is parameterized by $p$. This explicitness is most needed when discussing properties of estimators, where a sharp distinction between parameters and their estimates must be maintained. – whuber Oct 31 '20 at 15:34
  • I see. So the subscript is to specify what the parameter of the distribution under consideration is, and it helps one distinguish the estimator from it. For example, $e^{-\lambda}$ can be the estimator and $\lambda$ the parameter (for the Poisson distribution); but, as in Example 10.1.17, we can also use $e^{-\lambda}$ as the parameter (for the Bernoulli distribution) and use $\lambda$ as the estimator for something. Overall it's like $E/\mathrm{Var}_{\text{parameter}}(\text{estimator})$. – Charlie Chang Oct 31 '20 at 15:59
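
Following up on the generating-function comment above, here is a symbolic check (my own sketch, using sympy) that $G(s)=(1-p+ps)^n$ recovers $\mathrm{Var}\sum X_i=np(1-p)$ via $\mathrm{Var}\,X=G''(1)+G'(1)-G'(1)^2$:

```python
# Sketch: recover the binomial variance from the probability generating function.
import sympy as sp

p, s, n = sp.symbols("p s n", positive=True)
G = (1 - p + p * s) ** n                  # pgf of Binomial(n, p), i.e. of sum X_i

G1 = sp.diff(G, s).subs(s, 1)             # G'(1)  = E[X]        = n p
G2 = sp.diff(G, s, 2).subs(s, 1)          # G''(1) = E[X(X-1)]   = n(n-1) p^2
var = sp.simplify(G2 + G1 - G1**2)

print(var)                                # n*p*(1 - p), possibly shown in expanded form
```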