Can we estimate the probability $\mathbf{P}(a-k|a - b) $ on a random graph?

Question

Let $G=(V,E)$ be an undirected random graph such that

$V$ is the set of nodes, and $E$ is the set of edges
Assume the ground graph $G$ is sparse enough, for example, $\frac{|E|}{|V|}= c \in [10, 40]$ or some not large number ?
each edge $uv\in E$ is associated with a probability $p_{uv}$, i.e., $uv$ is kept with probability $p_{uv}$, otherwise removed.

Let $a,b,k$ be three nodes of the graph.

Let $a-b$ be the event that $a$ is reachable from $b$. Of course, because the graph is undirected, $b$ is also reachable from $a$.

My question is

Can we estimate the conditional probability $\mathbf{P}(a-k|a - b)$?

When the graph only has $a,b,k$, and $p_{ab}=1$, then its value is $p_{ak}+(1-p_{ak})p_{bk}$. But this case impossible when the graph is big enough.

When $a=b$, then its value is $\mathbf{P}(a-k,a-k|a - a)=\mathbf{P}(a-k)$.

What about the general case?

Do you know anything about the values of the $p_{uv}$? This is likely to be extremely difficult in full generality, but straightforward for many particular examples. For instance, if the $p_{uv}$ are constant and $|G|$ is large then all of your expressions are approximately equal to $1$ as $G$ is very likely to be connected. — Ben Barber, Aug 17 '15 at 15:46
@BenBarber OK, we assume that $p_{uv}$ are small enough, for example, $\sup_{uv} p_{uv} \leq 0.05$ or at least most probabilities are small enough. And assume $G$ is large enough which has thousands or millions of nodes. — Wieshawn, Aug 17 '15 at 16:30

score 3 · Accepted Answer · edited Apr 13 '17 at 12:58

If you're willing to take $n$ large then we can use some known results about the "giant component," since the asymptotic situation is a good approximation to reality. Here is a paper by Erdos and Renyi, and here is a nice expository write-up (the break-down on pages 3 and 4 is excellent). If $p = c\log(n)/n$ then the random graph is connected, so your $P(a-b|a-k) = 1$ for any $a,b,k$. See also this MO question. If $p \sim c/n$ for a constant $c>1$ then there is a giant component and some smaller components. If you look into the Erdos-Renyi paper you can write down the probability a vertex lands in the giant component as a function of $n$. Let's call this probability $q$. Since there's no relationship between b and k, the probability you seek is bounded below by the probability that each of a,b,k are in the giant component, which is approximately $q^3$. You can get even more precise if you want, using the distribution of smaller components to discuss the situation where a and b are in the same component but not the giant component. In that case your $P(a-k|a-b)$ is just $P(k$ is in that component too).

If $p$ is very small compared to $n$, like $o(1/n)$ then $G(n,p)$ is a disjoint union of trees, and Erdos-Renyi analyze the sizes of these trees. You can use their formulas to compute for each tree the probability that $k$ is in the tree. Then take the weighted sum over all trees (weighted by the size of the tree, which is proportional to the probability $a$ is in the tree) and you can compute $P(a-k)$. I think here the extra information that $a-b$ is not going to help much, since $b$ and $k$ have no relationship. To me, knowing $a-b$ occurs just tells me that either $a=b$ or that $a$ is not isolated. The first is very unlikely if $a,b,k$ are chosen at random, while the second can be factored into your computation by ruling out trees of size 1, again using the formulas from Erdos-Renyi to see how likely those trees are. You can conduct a similar analysis for $p\sim c/n$ for $c<1$ or other cases, but I think those in the first paragraph sound more likely to be helpful based on your question. The lecture notes are a good reference.

Thanks. Erdos and Renyi assume the graph are generated from the ground complete graph with $C_n^2$ edges. But I intended the ground graph sparse enough, for example $\frac{|E|}{|V|} = c \in [10, 40]$? Under such assumption, I think the probability would not be close to $1$? — Wieshawn, Aug 18 '15 at 02:00
I'm having trouble parsing your comment. You ask if the probability is not close to 1. Could you clarify by telling me which probability you're talking about? — David White, Aug 18 '15 at 11:41
Also, just to clarify, Erdos-Renyi write $C_n^N$ to mean (n choose 2) choose N. But when you write $C_n^2$ you must mean n choose 2 (i.e. all possible options). I would not say they start with a ground complete graph. I think they start with an empty graph on $n$ vertices and add edges randomly (I suppose equivalently they could start with a complete graph and discard edges randomly). In your new edit I guess you mean some edges are already there and you randomly add more edges? — David White, Aug 18 '15 at 11:54

Can we estimate the probability $\mathbf{P}(a-k|a - b) $ on a random graph?

1 Answers1