
(This question can be considered a follow-up to the following question - About Sampling and Random Variables)

I am taking a statistics course, and every resource I look at says the following: let $X$ be a random variable with $E(X)=\mu$, and let $s_1$, $s_2$, ..., $s_n$ be $n$ samples drawn from $X$. The sample mean is defined to be $\bar{s}=(s_1+s_2+...+s_n)/n$. Then $E(\bar{s})=\mu$, and $\bar{s}$ is an unbiased estimator. The question is: since $\bar{s}$ is NOT a random variable (see link above), how can its expectation be talked about?

  • I do not get the point: $\bar s$ is a random variable. – Xi'an Jun 12 '20 at 11:59
  • Apparently not. I think they consider $\bar{s}$ the fixed value obtained post sampling, and thus do not treat it as a random variable. I agree though that it makes $\bar{s}$ meaningless, if $\bar{S}$ is the only object you can reason about. – stebahpla Jun 13 '20 at 00:46

2 Answers


I think your understanding of the issue might be helped by reviewing the difference between an estimate and an estimator, as well as perhaps some more careful notation.

Suppose $S_1, S_2, \ldots, S_n$ are IID random variables with the same distribution as $X$ and $E(X) = \mu$. Then the mean of these random variables is $$ \bar{S} = \frac{1}{n} \sum_{i=1}^n S_i.$$

Now $E(\bar{S}) = \frac{1}{n} \sum_{i=1}^n E(S_i) = \mu $.

We call $\bar{S}$ an estimator; it is a random variable and therefore has a probability distribution. This distribution has mean $\mu$.

Now suppose I observe data $(s_1, s_2, \ldots, s_n)$. This has mean $$\bar{s} = \frac{1}{n} \sum_{i=1}^n s_i.$$ $\bar{s}$ is an estimate of $\mu$. It is a realisation of the estimator $\bar{S}$, in the same way that each $s_i$ is a realisation of the random variable $S_i$.
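A short simulation may make the distinction concrete. This is a minimal sketch with hypothetical population parameters ($\mu = 5$, $\sigma = 2$, $n = 30$): each call draws a fresh sample and returns one estimate $\bar{s}$; repeating the experiment many times traces out the distribution of the estimator $\bar{S}$, which is centred at $\mu$.

```python
import random
import statistics

random.seed(0)

mu, sigma, n = 5.0, 2.0, 30  # hypothetical population parameters and sample size


def sample_mean_estimate(n):
    """Draw one sample of size n from X and return its mean:
    a single realisation s-bar of the estimator S-bar."""
    draws = [random.gauss(mu, sigma) for _ in range(n)]
    return statistics.mean(draws)


# Each call yields a different number (a different estimate) ...
estimates = [sample_mean_estimate(n) for _ in range(10_000)]

# ... but the collection of estimates approximates the distribution
# of the estimator S-bar, whose mean is mu.
print(statistics.mean(estimates))
```

Here each element of `estimates` plays the role of $\bar{s}$, while the variability across elements is the sampling variability of $\bar{S}$.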

jcken
  • Thanks for the clarification. Why is $\bar{s}$ then even meaningful at all? And why is it even talked about? Since the only object that can be reasoned about is $\bar{S}$, the estimate $\bar{s}$ is entirely meaningless. – stebahpla Jun 13 '20 at 00:49
  • $\bar{s}$ is the thing we actually observe, it is a summary of the data. It tells us about the population mean of $\bar{S}$ and $X$. It is not meaningless at all – jcken Jun 13 '20 at 06:23
  • What I mean is, $\bar{S}$ can be reasoned about: you can find $E(\bar{S})$ or higher moments, $E(f(\bar{S}))$ for some function $f$, or $Pr(\bar{S}>\alpha)$, and so on. But all you can say about $\bar{s}$ is that it is a real number. That's it. Any further reasoning cannot continue without appealing to $\bar{S}$, for $\bar{s}$ is simply a deterministic number. If you try to reason about where that number came from or anything else, you go back to $\bar{S}$. The value $\bar{s}$ indeed seems meaningless without any concrete application. – stebahpla Jun 13 '20 at 11:34

As you can read in About Sampling and Random Variables, "You can think of a "population" as an infinite reservoir of values drawn from $Y$. Sampling from a population is analogous to repeatedly drawing new values from $Y$. A sample of size $N$ is a size-$N$ collection of individual draws from $Y$." Moreover, "If you flip one coin $N$ times, that is the same as flipping $N$ identical coins, once per coin."

A random sample is like $N$ identical coins: it is a set of independent and identically distributed random variables, $X_1,X_2,\dots,X_N$. Their mean $\overline{X}_N$ is a random variable, and its expected value is $E[\overline{X}_N]=E[X_1]=\mu$.

An observed random sample is a set of observed values (let's say, the heads and tails you get when you flip $N$ identical coins), and the observed sample mean is the realization of the sample mean random variable.
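The coin-flip analogy can be sketched in a few lines. This is an illustrative simulation (the fair-coin probability $p=0.5$ and sample size $N=1000$ are assumed values, not from the answer): the list of observed flips is one observed random sample, and its mean is one realisation of the sample-mean random variable $\overline{X}_N$.

```python
import random

random.seed(1)

N = 1000  # sample size (assumed for illustration)
p = 0.5   # probability of heads for a fair coin (assumed)

# One observed random sample: N realised flips (1 = heads, 0 = tails).
flips = [1 if random.random() < p else 0 for _ in range(N)]

# The observed sample mean: a single realisation of X-bar_N,
# which should be near E[X_1] = p for a fair coin.
x_bar = sum(flips) / N
print(x_bar)
```

Rerunning with a different seed gives a different observed sample and hence a different realised value `x_bar`, even though the estimator's expectation stays $p$.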

Sergio