
For an iid sample of $n$ realizations of random variable $X$ with $\mathbb{E}\left[X\right] = \mu > 0$, let $\bar{X}_n$ denote its sample mean. Is there a distribution for $X$ such that $\mathbb{E}\left[\frac{1}{\bar{X}_n}\right] < \frac{1}{\mathbb{E}\left[\bar{X}_n\right]}$?

Jensen's inequality shows that if the support of $X$ only contains positive numbers, this cannot be the case, but is there a good counterexample if we don't impose this restriction?
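(Concretely, since $x \mapsto 1/x$ is convex on $(0, \infty)$, Jensen's inequality gives \begin{align} \mathbb{E}\left[\frac{1}{\bar{X}_n}\right] \geq \frac{1}{\mathbb{E}\left[\bar{X}_n\right]} \end{align} whenever the support of $X$ lies in $(0, \infty)$.)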

Thanks in advance!

Jakob J
  • Thank you for your input. The generated random variable will however have a negative expectation and thus not provide a counterexample. – Jakob J Feb 05 '23 at 22:50
  • Sorry, missed that bit. – Glen_b Feb 05 '23 at 22:50
  • My strategy would be to look at a family of positive r.v.s where the ratio of arithmetic to harmonic mean was monotonic in some parameter (there seem to be a couple of fairly accessible potential choices) and then choose a mix of two distinct family members (with the second negated) such that the expectation was just positive but the parameter in the second component was chosen so that the harmonic mean would pull that relationship the other way. That should at least accomplish it for some $n$ (like $n=1$) though not necessarily all $n$. – Glen_b Feb 05 '23 at 23:00
  • https://stats.stackexchange.com/questions/141766 and https://stats.stackexchange.com/questions/305713 are relevant. – whuber Feb 06 '23 at 15:16

1 Answer


At least when $n = 1$, a counterexample is easy to construct. For example, let $X$ have the three-point distribution \begin{align} P[X = -2] = 0.2, \quad P[X = -1] = 0.25, \quad P[X = 2] = 0.55. \end{align} Then \begin{align} & E[X] = -0.4 - 0.25 + 1.1 = 0.45 > 0, \\ & E[1/X] = -0.1 - 0.25 + 0.275 = -0.075 < 0. \end{align} Since $\bar{X}_1 = X$, this gives $E[1/\bar{X}_1] < 0 < 1/E[\bar{X}_1]$.
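A quick numerical sanity check of these values (a minimal Python sketch, assuming NumPy is available):

```python
import numpy as np

# The three-point distribution from the example above.
support = np.array([-2.0, -1.0, 2.0])
probs = np.array([0.2, 0.25, 0.55])

# Exact expectations.
E_X = np.sum(probs * support)       # 0.45
E_inv_X = np.sum(probs / support)   # -0.075
print(E_X, E_inv_X, 1.0 / E_X)      # E[1/X] = -0.075 < 1/E[X] ~ 2.22

# Monte Carlo confirmation for n = 1 (where Xbar_1 = X).
rng = np.random.default_rng(0)
x = rng.choice(support, size=10**6, p=probs)
print(np.mean(1.0 / x))             # close to -0.075
```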

For general $n$, the following fact (Asymptotic Statistics by A. W. van der Vaart, Exercise 5.8) is also interesting: although it is not an example of $E[\bar{X}_n^{-1}] < 1/E[\bar{X}_n]$, it can be used to show that $E[\bar{X}_n^{-1}] \geq 1/E[\bar{X}_n]$ does not always hold (in fact, it shows that for a large family of distributions, the expectation of $\bar{X}_n^{-1}$ either does not exist or is infinite):

Let $X_1, X_2, \ldots, X_n$ be i.i.d. with expectation $1$ and finite variance. If the random variables are sampled from a density $f$ that is bounded and strictly positive in a neighborhood of zero, then $E[|\bar{X}_n^{-1}|] = \infty$ for every $n$.


Proof of the fact. It can be shown by induction that the density of $\bar{X}_n$ is bounded away from zero in a neighborhood of zero for every $n$.

When $n = 1$, this is exactly the assumption on $f$.

Suppose the density of $\bar{X}_k$ is bounded away from zero in a neighborhood of zero, i.e., there exist $\delta > 0$ and $\epsilon > 0$ such that $f_{\bar{X}_k}(x) \geq \epsilon$ for all $x \in (-\delta, \delta)$. By the assumption on $f$, there exist $\delta' > 0$ and $\epsilon' > 0$ such that $f_{X_{k + 1}}(x) = f(x) \geq \epsilon'$ for all $x \in (-\delta', \delta')$. Let $\delta_0 = \min(\delta, \delta')$ and $\epsilon_0 = \min(\epsilon, \epsilon')$; then $\min(f_{\bar{X}_k}(x), f(x)) \geq \epsilon_0 > 0$ for all $x \in (-\delta_0, \delta_0)$.

Since $\bar{X}_{k + 1} = \frac{k}{k + 1}\bar{X}_k + \frac{1}{k + 1}X_{k + 1}$ and the two summands are independent, the density of $\bar{X}_{k + 1}$ can be obtained by the convolution formula (and densities of the scale family): \begin{align} & f_{\bar{X}_{k + 1}}(y) = \int_{-\infty}^\infty f_{(k + 1)^{-1}X_{k + 1}}(y - x)f_{(k + 1)^{-1}k\bar{X}_k}(x)dx \\ =& \int_{-\infty}^\infty (k + 1)f((k + 1)(y - x))k^{-1}(k + 1)f_{\bar{X}_k}(k^{-1}(k + 1)x)dx \\ =& k^{-1}(k + 1)^2\int_{-\infty}^\infty f((k + 1)(y - x))f_{\bar{X}_k}(k^{-1}(k + 1)x)dx \\ \geq & k^{-1}(k + 1)^2\int_{y - (k + 1)^{-1}\delta_0}^{y + (k + 1)^{-1}\delta_0} f((k + 1)(y - x))f_{\bar{X}_k}(k^{-1}(k + 1)x)dx. \tag{1} \end{align}

If $|y| < \frac{1}{2}\delta_0$, then when $x \in (y - (k + 1)^{-1}\delta_0, y + (k + 1)^{-1}\delta_0)$, we have $(k + 1)(y - x) \in (-\delta_0, \delta_0)$ and $k^{-1}(k + 1)x \in \left(-\frac{k + 3}{2k}\delta_0, \frac{k + 3}{2k}\delta_0\right) \subset (-\delta_0, \delta_0)$ when $k \geq 3$, hence the last integral in $(1)$ is at least $2k^{-1}(k + 1)\delta_0\epsilon_0^2 > 0$. (For $k = 1, 2$, the same argument works after shrinking the neighborhoods: if $|y| < \frac{k}{2(k + 1)}\delta_0$ and the integration is restricted to $x \in \left(y - \frac{1}{2(k + 1)}\delta_0, y + \frac{1}{2(k + 1)}\delta_0\right)$, then both $(k + 1)(y - x)$ and $k^{-1}(k + 1)x$ lie in $(-\delta_0, \delta_0)$, and the bound becomes $k^{-1}(k + 1)\delta_0\epsilon_0^2 > 0$.) This completes the inductive step. Therefore the density of $\bar{X}_n$ is bounded away from zero in a neighborhood of zero for every $n$.
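As a numerical illustration of this density bound (a hedged sketch under the assumed example $X_i \sim N(1, 1)$, which has mean $1$, finite variance, and a density bounded and strictly positive near zero), the density of $\bar{X}_n$ at $0$ can be estimated by simulation and indeed stays strictly positive for each fixed $n$:

```python
import numpy as np

# Assumed example distribution: X_i ~ N(1, 1), which satisfies the conditions of
# the fact above.  Estimate the density of Xbar_n at 0 via P(|Xbar_n| < h) / (2h).
rng = np.random.default_rng(0)
reps, h = 10**6, 0.02
for n in (1, 2, 3, 5, 10):
    xbar = rng.normal(loc=1.0, scale=1.0, size=(reps, n)).mean(axis=1)
    density_at_0 = np.mean(np.abs(xbar) < h) / (2 * h)
    print(n, density_at_0)  # strictly positive for every n, as the induction asserts
```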

For fixed $n$, suppose $f_{\bar{X}_n}(x) \geq \epsilon^*$ for all $x \in (-\delta^*, \delta^*)$ for some $\epsilon^* > 0$ and $\delta^* > 0$. For any $a \in (0, \delta^*)$:
\begin{align} & E[|\bar{X}_n^{-1}|] = \int_\mathbb{R} \frac{1}{|x|}f_{\bar{X}_n}(x)dx \\ \geq & \int_a^{\delta^*}\frac{1}{x}f_{\bar{X}_n}(x)dx \\ \geq & \epsilon^*\int_a^{\delta^*}\frac{1}{x} dx \\ =& \epsilon^*(\log\delta^* - \log a), \end{align} which diverges to $+\infty$ as $a \downarrow 0$. Therefore, $E[|\bar{X}_n^{-1}|] = \infty$. This completes the proof.
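Numerically, the infinite expectation shows up as a running Monte Carlo average of $|\bar{X}_n^{-1}|$ that never stabilizes: it jumps upward whenever $\bar{X}_n$ happens to land very close to zero. A minimal Python sketch, again using the assumed example $X_i \sim N(1, 1)$:

```python
import numpy as np

# Illustration of E[|1/Xbar_n|] = infinity for the assumed example X_i ~ N(1, 1).
# The running mean of |1/Xbar_n| keeps jumping instead of converging, driven by
# the rare draws where Xbar_n is very close to 0.
rng = np.random.default_rng(1)
n, reps = 5, 10**6
xbar = rng.normal(loc=1.0, scale=1.0, size=(reps, n)).mean(axis=1)
running_mean = np.cumsum(np.abs(1.0 / xbar)) / np.arange(1, reps + 1)
for k in (10**3, 10**4, 10**5, 10**6):
    print(k, running_mean[k - 1])  # no sign of settling down as k grows
```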

Zhanxiong