Why the larger the sample, the lower standard deviation?

Question

I concern about why country-level variables normally have higher standard deviation compared to that in firm-level variables.

Today, my senior friend told me that it seems to be because the firm-level variables have more observations compared to country-level variables, leading to such an asymmetric standard deviation.

I am wondering is there any mathematical, reference document, or intuitive way to explain such a justification.

You seem to be linking to a question without any answers or comments. — Giskard, Jun 15 '21 at 06:53

score 6 · Accepted Answer · answered Jun 15 '21 at 08:22

Intuitively, variables that only vary at the group level should have a lower variance (and therefore lower standard deviation) than comparable variables that vary at the individual level.

As these variables do not vary within each group, their within-group variance is zero, so their variance is solely determined by the between-group variation.

To see this, let $i$ be individual and $g$ be the group level (e.g. country). Let us denote the means and conditional means as: $$ \begin{align*} &x_{..} = \mathbb{E}(x_{i,g}),\\ &x_{.g} = \mathbb{E}(x_{i,g}|g) \end{align*} $$ Then, we can write: $$ \begin{align*} var(x_{ig}) &= \mathbb{E}(x_{ig}^2) - x_{..}^2,\\ &= \sum_g \left(\mathbb{E}(x_{ig}^2|g) - x_{.g}^2 \right)\Pr(g) + \sum_g \left( x_{.,g}^2 - x_{..}^2 \right) \Pr(g),\\ &= \sum_g var(x_{i,g}|g) \Pr(g) + var(x_{.,g}) \end{align*} $$ So we see that the variance of $x_{i,g}$ can be written as a weighted sum of the within group variances plus the between-group variance.

Now, consider two variables $x_{i,g}$ and $y_{i,g}$ such that $x_{.g} = y_{.g}$ but $y$ only varies on the group level, (so $y_{i,g} = y_{.g}$ for all $i$).

As they have the same group mean, we have that: $$ var(x_{.g}) = var(y_{.g}). $$ also $$ var(y_{i,g}|g) = 0, \text{ while } var(x_{i,g}|g) \ge 0. $$ As such: $$ var(x_{i,g}) \ge var(y_{i,g}). $$ This captures the simple fact that averaging across groups can only decrease the variance. In the extreme case where there is only one group, we get that $y_{.g} = y_{..}$, so the variance drops to zero.

Thank @tdm. So, in short, do you mean firm-level should have a higher standard deviation compared to comparable country-level variables? — Phil Nguyen, Jun 15 '21 at 08:28
@BeautifulMindset That's indeed what happens if the variables are comparable (in terms of within group mean). Notice that this is different from the st-dev of the estimated coefficients in a regression (which, I guess, was the question in the linked post). — tdm, Jun 15 '21 at 08:53
Hi @tdm, I get lost at this place, can you please tell me where it comes from then?$$\begin{align*} var(x_{ig}) &= \mathbb{E}(x_{ig}^2) - x_{..}^2$$ — Phil Nguyen, Jun 17 '21 at 23:26
@BeautifulMindset We have that $var(x) = \mathbb{E}((x - \mathbb{E}(x))^2)$. If you work this out, you should get that $var(x) = \mathbb{E}(x^2) - \mathbb{E}(x)^2$. Here $x_{..} = \mathbb{E}(x)$, which gives the formulat you are referring to. — tdm, Jun 18 '21 at 05:23
I just came back and let you know that I prove it successfully, thank you so much — Phil Nguyen, Jun 18 '21 at 22:05

Why the larger the sample, the lower standard deviation?

1 Answers1