21

I encountered the following result during my research.

$$\lim\limits_{n\to \infty} \mathbb{E}\left[ \frac{\#\{|a_i-a_j|,1\le i,j\le m \}}{n} \right] = 1$$ where $m=\omega(\sqrt n)$ and $a_1,\cdots,a_m$ are chosen at random from $[n]$.

I am looking for a reference/a direct proof.


Crossposted on MO

Zhu Cao
  • 313
  • 1
  • 4
  • 1
    If $m = \sqrt{n}$, the maximum number of different differences you can get is $m(m-1)/2 < n/2$. So you really need $m$ to grow *faster* than $\sqrt{n}$ for this to be true. What I would do is try to calculate the probability that a number $d$ is not a difference $d = |a_i - a_j|$. – Peter Shor Jun 19 '15 at 11:18
  • @Shor:thanks, I updated the question. And indeed since $\mathbb E(\sum x_i)=\sum \mathbb E(x_i)$, it is easier to calculate for a specific difference $d$. – Zhu Cao Jun 19 '15 at 12:02
  • Let $m = \sqrt{(2+2\varepsilon)n}$. Then $m(m-1)/2 \ge n$ whenever $n \ge (1+\varepsilon)/2\varepsilon^2$. So instead of $m = \omega(\sqrt{n})$, it might be possible that $m \ge \sqrt{2(1+\varepsilon)n}$ is enough, for some $\varepsilon > 0$. – András Salamon Jun 20 '15 at 09:33
  • @Salamon:thanks for the comment. However, $m=c\sqrt{n}$ may not be enough. Consider the following heuristics. In $1,...,n/c^3$ and $n-n/c^3,...,n$, about $2\sqrt{n}/c^2$ numbers are selected. Then absolute differences in [n(1-1/c^3),n] will appear for 4n/c^4 times. Thus when $n\to\infty$, the upper bound on the expectation value is $1-1/2c^3$, not $1$. – Zhu Cao Jun 21 '15 at 03:12
  • I think that if you move this to mathoverflow and tag probability, you will get an answer much faster. – domotorp Jun 21 '15 at 12:03
  • @domotorp:thanks for the advice. Crossref http://mathoverflow.net/questions/209817/ – Zhu Cao Jun 21 '15 at 13:03
  • Presumably the $m$ samples are independent and uniform from the $n$ discrete values. Let $D_n$ denote the number of different differences in a sample of $n/2 = \omega(\sqrt{n})$ numbers chosen from ${1,2,\dots,n}$. If the result in the question is true, then $\lim_{n\to\infty} Pr[D_n = n] = 1$. Is this really the case? – András Salamon Jun 21 '15 at 17:12
  • 1
    @ZhuCao, when you say "choose $m$ integers $a_1,...,a_m$ randomly from $[1,n]$", what distribution do you mean exactly? I was assuming i.i.d. uniform ${1,\dots,n}$. – usul Jun 21 '15 at 19:55
  • 1
    @Andras no, it is not the case. For example, if the number $1$ is not chosen (which happens with probability bounded away from 0) then the difference $n-1$ cannot appear, and $D_n<n$. But why does it need to be the case? The question only asks that the expectation of $D_n/n$ gets close to 1, not that $D_n$ is equal to 1 with high probability. – James Martin Jun 21 '15 at 20:48
  • 2
    Please don't cross-post on multiple Stack Exchange sites. Our site policy forbids simultaneous cross-posting: it says, at a minimum, wait a week. And if you get no good answer, you can always flag it for moderator attention to ask that it be migrated. – D.W. Jun 22 '15 at 00:52

1 Answers1

7

Assume as given that $m=\omega(\sqrt{n})$.

Fix any $\epsilon>0$. We will consider $r\in[1,n]$ with $r<(1-\epsilon)n$. The aim is to show that with high probability as $n\to\infty$, $r$ is included in the set of differences.

First consider the set $A=\{a_i:i<m/2\}\cap[1,\epsilon n]$. The number of $i$ with $i<m/2$ such that $a_i<\epsilon n$ is binomial with expectation around $\epsilon m/2$. So with high probability as $n\to\infty$, the number of such $i$ will be at least $\epsilon m/4$, which is $\omega(\sqrt{n})$. Then (claim, "left as exercise", not hard to show) with high probability as $n\to\infty$, the set $A$ has size at least $\sqrt{n}$. Let us write $G$ for this "good event", that $|A|\geq\sqrt{n}$.

Suppose that indeed $G$ holds, i.e. there are at least $\sqrt{n}$ distinct values of $a_i$ less than $\epsilon n$, for $i<m/2$. Note that for each such value, there is a value $k\in [1,n]$ which is precisely $r$ larger. Now consider the values of $a_i$ for $i\geq m/2$. These are independent and each one has probability at least $\sqrt{n}/n=1/\sqrt{n}$ of being at distance $r$ from an element of the set $A$. The probability that no difference $r$ is produced is then at most $(1-1/\sqrt{n})^{m/2}$ which goes to 0 as $n\to\infty$ since $m=\omega(\sqrt{n})$. So indeed, the probability that $G$ holds but no difference of size $r$ exists tends to 0 as $n\to\infty$.

So (uniformly in $r<(1-\epsilon)n$) the probability that $r$ is included in the set of differences tends to 1 as $n\to\infty$. Hence using linearity of expectation, $$ \liminf \limits_{n\to \infty} \mathbb{E}\left[ \frac{\#\{|a_i-a_j|,1\le i,j\le m \}}{n}\right] \geq 1-\epsilon.$$ Since $\epsilon$ is arbitrary, the limit is 1 as desired.

James Martin
  • 186
  • 4