2

I concern about why country-level variables normally have higher standard deviation compared to that in firm-level variables.

Today, my senior friend told me that it seems to be because the firm-level variables have more observations compared to country-level variables, leading to such an asymmetric standard deviation.

I am wondering is there any mathematical, reference document, or intuitive way to explain such a justification.

Phil Nguyen
  • 1,130
  • 3
  • 10
  • 27

1 Answers1

6

Intuitively, variables that only vary at the group level should have a lower variance (and therefore lower standard deviation) than comparable variables that vary at the individual level.

As these variables do not vary within each group, their within-group variance is zero, so their variance is solely determined by the between-group variation.

To see this, let $i$ be individual and $g$ be the group level (e.g. country). Let us denote the means and conditional means as: $$ \begin{align*} &x_{..} = \mathbb{E}(x_{i,g}),\\ &x_{.g} = \mathbb{E}(x_{i,g}|g) \end{align*} $$ Then, we can write: $$ \begin{align*} var(x_{ig}) &= \mathbb{E}(x_{ig}^2) - x_{..}^2,\\ &= \sum_g \left(\mathbb{E}(x_{ig}^2|g) - x_{.g}^2 \right)\Pr(g) + \sum_g \left( x_{.,g}^2 - x_{..}^2 \right) \Pr(g),\\ &= \sum_g var(x_{i,g}|g) \Pr(g) + var(x_{.,g}) \end{align*} $$ So we see that the variance of $x_{i,g}$ can be written as a weighted sum of the within group variances plus the between-group variance.

Now, consider two variables $x_{i,g}$ and $y_{i,g}$ such that $x_{.g} = y_{.g}$ but $y$ only varies on the group level, (so $y_{i,g} = y_{.g}$ for all $i$).

As they have the same group mean, we have that: $$ var(x_{.g}) = var(y_{.g}). $$ also $$ var(y_{i,g}|g) = 0, \text{ while } var(x_{i,g}|g) \ge 0. $$ As such: $$ var(x_{i,g}) \ge var(y_{i,g}). $$ This captures the simple fact that averaging across groups can only decrease the variance. In the extreme case where there is only one group, we get that $y_{.g} = y_{..}$, so the variance drops to zero.

tdm
  • 11,747
  • 9
  • 36
  • Thank @tdm. So, in short, do you mean firm-level should have a higher standard deviation compared to comparable country-level variables? – Phil Nguyen Jun 15 '21 at 08:28
  • 1
    @BeautifulMindset That's indeed what happens if the variables are comparable (in terms of within group mean). Notice that this is different from the st-dev of the estimated coefficients in a regression (which, I guess, was the question in the linked post). – tdm Jun 15 '21 at 08:53
  • I got it now, thank you, @tdm – Phil Nguyen Jun 15 '21 at 08:55
  • Hi @tdm, I get lost at this place, can you please tell me where it comes from then?$$\begin{align*} var(x_{ig}) &= \mathbb{E}(x_{ig}^2) - x_{..}^2$$ – Phil Nguyen Jun 17 '21 at 23:26
  • 1
    @BeautifulMindset We have that $var(x) = \mathbb{E}((x - \mathbb{E}(x))^2)$. If you work this out, you should get that $var(x) = \mathbb{E}(x^2) - \mathbb{E}(x)^2$. Here $x_{..} = \mathbb{E}(x)$, which gives the formulat you are referring to. – tdm Jun 18 '21 at 05:23
  • I just came back and let you know that I prove it successfully, thank you so much – Phil Nguyen Jun 18 '21 at 22:05