
Am I right in thinking that it is the average of $n$ different population means?

Here it is used in the context that confused me. Apparently it is the Chebychev WLLN.

"If $x_i$, $i = 1, \ldots, n$, is a sample of $n$ observations such that $E[x_i] = \mu_i < \infty$ and $\operatorname{Var}[x_i] = \sigma_i^2$ such that $\bar\sigma_n^2/n = (1/n^2)\sum_i \sigma_i^2 \rightarrow 0$ as $n \rightarrow \infty$, then $\operatorname{plim}(\bar x_n - \bar\mu_n) = 0$."

Is this saying that each sample $i$ corresponds to its own population $i$ (from above, $E[x_i] = \mu_i$), and that as the sample gets bigger we have to average over populations?

So if I were to draw the random variable 1 from a population of {1,2,3} and the random variable 4 from the population {4,5,6} then the population of my sample 1,4 is {1,2,3,4,5,6}?

amoeba
  • 104,745
EconStats
  • 865

1 Answer


You are right.

"If $x_i$, $i = 1, \ldots, n$, is a sample of $n$ observations such that $E[x_i] = \mu_i < \infty$ and $\operatorname{Var}[x_i] = \sigma_i^2$ such that $\bar\sigma_n^2/n = (1/n^2)\sum_i \sigma_i^2 \rightarrow 0$ as $n \rightarrow \infty$," then, for every $\epsilon > 0$,

$$\lim_{n\rightarrow \infty}P\left(\left|\frac 1n\sum_{i=1}^nX_i - \frac 1n\sum_{i=1}^nE(X_i)\right|<\epsilon\right) =1$$

I guess you can make the notational mapping.
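
One way to see why the variance condition does the work is Chebyshev's inequality. This is a sketch under an extra assumption that the quoted statement does not spell out, namely that the $X_i$ are uncorrelated. Write $\bar X_n = \frac 1n\sum_{i=1}^n X_i$, $\bar\mu_n = \frac 1n\sum_{i=1}^n \mu_i$ and $\bar\sigma_n^2 = \frac 1n\sum_{i=1}^n \sigma_i^2$. Then

$$E(\bar X_n) = \bar\mu_n, \qquad \operatorname{Var}(\bar X_n) = \frac{1}{n^2}\sum_{i=1}^n \sigma_i^2 = \frac{\bar\sigma_n^2}{n},$$

and for any $\epsilon > 0$

$$P\left(\left|\bar X_n - \bar\mu_n\right| \geq \epsilon\right) \leq \frac{\operatorname{Var}(\bar X_n)}{\epsilon^2} = \frac{\bar\sigma_n^2}{n\,\epsilon^2} \rightarrow 0 \quad \text{as } n \rightarrow \infty.$$

The complement of that event is the one in the limit statement above, so its probability goes to 1. Note that nothing here requires the $\mu_i$ to be equal: the sample mean is compared with the average of the individual means, not with a single population mean.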

Since by design we assume different moments for each $X_i$, each comes from a different population. So if by $\{1,2,3\}$ you mean values of the index $i$, then $\{1,2,3\}$ is not a population, but a set including three values of the index with each value representing a different population.

If you consider the random variables $\{X_1,X_4\}$, it is a pair coming from two different populations. You do not "unite" the two populations "into one" because, being different with respect to the object of study (convergence of sample moments), how could they form a single population (for the purposes of the specific study)? Have you contemplated how the abstract concept of "statistical population" is defined?
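
A small simulation may make the "one draw per population" reading more concrete. The sketch below is only an illustration under assumed settings that are not in the question: the draws are independent normals, and the means `mu_i = i % 5` and standard deviations `sigma_i = 1 + i % 3` are hypothetical choices made so that every $X_i$ has different moments while the variances stay bounded. The function names `simulate_gap` and `coverage` are likewise made up for this example.

```python
import random


def simulate_gap(n, rng):
    """Draw one X_i from each of n different populations and return
    |sample mean - average of the n population means|."""
    total_x = 0.0
    total_mu = 0.0
    for i in range(1, n + 1):
        mu_i = float(i % 5)        # hypothetical mean of population i
        sigma_i = 1.0 + (i % 3)    # hypothetical std. dev. of population i (at most 3)
        total_x += rng.gauss(mu_i, sigma_i)  # a single draw from population i
        total_mu += mu_i
    return abs(total_x / n - total_mu / n)


def coverage(n, eps=0.25, reps=500, seed=0):
    """Monte Carlo estimate of P(|x_bar_n - mu_bar_n| < eps)."""
    rng = random.Random(seed)
    hits = sum(simulate_gap(n, rng) < eps for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, coverage(n))
```

With these settings the estimated probability should move toward 1 as `n` grows, even though no two neighbouring draws share the same mean and variance. The populations are never pooled; only their means are averaged.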

  • No, I didn't intend for {1,2,3} to be the index values of $i$; I meant for {1,2,3} to be the population that the sample $x_1$ is drawn from. For example, the sample $x_1$ has one element {1} but $\mu_1 = 2$, i.e. $(1+2+3)/3$. – EconStats Jul 26 '14 at 21:55
  • You mean the values that the random variable can take? This is never called a "population". It is the set from which the members of the population take their values, usually called the "support". In the example you state in the question, the joint support is the Cartesian product $\{1,2,3\} \times \{4,5,6\}$. You have a two-dimensional vector here, so the support should also be two-dimensional. – Alecos Papadopoulos Jul 26 '14 at 21:59
  • Continuing with the index notation for a moment: in this example, do you think that $x_1$ and $x_n$ are draws from two different populations, but that the number of observations in sample $n$ is greater than the number of observations in sample 1? If this is the case though (and I'm not saying it is, because I don't fully understand it), then we don't need to average across populations because $\operatorname{plim}(\bar x_n - \mu_n) = 0$. I guess I'm wondering what is happening when the index is going to $n$. Are we drawing larger numbers from the same population or from different populations? – EconStats Jul 26 '14 at 22:05
  • With respect to the most recent comment, I had actually never heard the word "support" used like that (though I had often used the phrase "common support" when doing PSM). So all the values that a population mean is calculated from are called the "support"? – EconStats Jul 26 '14 at 22:15
  • As for the terminology, the "support" of the distribution of a random variable is standard, at least in mathematical statistics, for the range of the random variable, which in turn is the domain of the probability functions (and from where we "draw" realizations of random variables). – Alecos Papadopoulos Jul 26 '14 at 23:01
  • First, please see part B of my answer to this question http://stats.stackexchange.com/questions/107912/what-is-the-difference-between-sample-and-outcome-plus-events-and-observations/107936#107936 regarding the use of the terms "sample" and "observation". $x_n$ (preferably $X_n$) does not denote the sum or the collection of all $X$'s up to $n$. It denotes only the $n$-th random variable (upper case) or its realization (lower case). The notation $\bar x_n$ does not imply that, by removing the bar, $x_n$ becomes the sum of all $X$'s up to that point (it is bad notation, anyway). – Alecos Papadopoulos Jul 26 '14 at 23:07
  • I was away, just saw your invitation. I am available now, if you are still at it. – Alecos Papadopoulos Jul 27 '14 at 00:23
  • Don't worry about responding, I think I have a million questions on this topic! I get the most basic version of the LLN, but when we move out of the arena of iid sampling to heterogeneous distributions and different populations, I really need to see an example for it to sink in. Kind of like doing a math problem, you need to complete one unassisted to truly understand the method. – EconStats Jul 27 '14 at 00:49
  • That's true, indeed. – Alecos Papadopoulos Jul 27 '14 at 01:00
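
For the concrete numbers used in the question and comments, a minimal sketch follows. It assumes each value in a support is equally likely, which is what the calculation $\mu_1 = (1+2+3)/3$ in the comments implies, and that the two variables can take any combination of values. The variable names are illustrative only.

```python
from itertools import product
from statistics import mean

# Supports of the two random variables from the question
support_1 = [1, 2, 3]
support_2 = [4, 5, 6]

# Population means, assuming equally likely values (an assumption)
mu_1 = mean(support_1)
mu_2 = mean(support_2)

# The quantity the theorem compares the sample mean with:
# the average of the two population means
mu_bar = mean([mu_1, mu_2])

# One realization: 1 drawn from the first population, 4 from the second
x_bar = mean([1, 4])

# Joint support: pairs (x_1, x_2), not the pooled set {1,...,6}
joint_support = list(product(support_1, support_2))

print(mu_1, mu_2, mu_bar, x_bar, len(joint_support))
```

The joint support has nine two-dimensional points, which is the Cartesian product mentioned in the comments rather than a single pooled population of six values.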