
I've been playing around a little with basic probability, and I was wondering whether there is any intuitive meaning to either of the following:

$$ P(A|B)P(B|C) \quad\text{or}\quad \sum_{B} P(A|B)P(B|C). $$

After playing with the definitions I didn't see any obvious simplifications, but I feel there should be one. I thought it might simplify to $P(A|C)$ but that doesn't work in general. And it's definitely not $P(A,B,C)$ or $P(A,C)$.

whuber
  • Are you familiar with probability trees? – whuber Jul 22 '13 at 20:05
  • Yes, but I can't find one that would correspond. – Christian Bueno Jul 22 '13 at 23:30
  • The two expressions invite you to draw a tree with nodes corresponding to $A$ and $C$ and intermediate nodes along the path from $C$ to $A$ corresponding to either one or multiple $B$'s. – whuber Jul 22 '13 at 23:45
  • Are A and C independent, dependent, or is this not known? – Glen_b Jul 23 '13 at 00:19
  • @Glen_b I was hoping to find an interpretation that did not rely on knowing any dependency information. – Christian Bueno Jul 23 '13 at 01:14
  • @whuber I had already tried looking at it in that way but it doesn't apply (at least not how I see it) because the appropriate weights on the edges would be $P(B|C)$ for $C\to B$ and $P(A|B,C)$ for $B\to A$ to yield a net probability of $P(A,B,C)$ at the leaf $A$. – Christian Bueno Jul 23 '13 at 01:14
  • @ChristianBueno Okay. I am not aware of a particularly intuitive meaning in general. In some specific situations, perhaps. – Glen_b Jul 23 '13 at 03:46

1 Answer


There are several good visual and physical metaphors to help the intuition. I offer one of each.

Conditional probabilities of the form $\Pr(A|C)$ can be represented on graphs where the events $C$ and $A$ are nodes, a directed edge connects $C$ to $A$, and the edge is weighted by this probability. This graphical metaphor, variously known as a "probability tree" or "hierarchical network," visualizes two axiomatic properties of probability.

Figure

  1. Path probabilities are products of edge probabilities: $\Pr(A,B,C) = \left[\Pr(A|B)\Pr(B|C)\right]\Pr(C)$. The nodes $C,B,A$ form a path along this tree from $C$ to $A$. The left-hand side is the chance of this path; the right-hand side is the product of the chance of being at the beginning and the chance of following the path. That latter chance, the "path probability" (written in brackets), is the product of the conditional probabilities. (In general the edge from $B$ to $A$ carries $\Pr(A|B,C)$; it reduces to $\Pr(A|B)$ when, given $B$, the event $A$ does not further depend on $C$, which is what drawing a single edge weighted by $\Pr(A|B)$ presumes.)

  2. Total probabilities are sums over all possible disjoint paths: $\Pr(A) = \Pr(A|B_1)\Pr(B_1) + \Pr(A|B_2)\Pr(B_2) + \cdots + \Pr(A|B_n)\Pr(B_n)$ where $B_1, B_2, \ldots, B_n$ are mutually exclusive events comprising all the incoming nodes at $A$.
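To make the two rules concrete, here is a minimal sketch on a tiny tree $C \to \{B_1, B_2\} \to A$. All of the numbers are hypothetical and chosen only for illustration, and the edge weights are assumed to be $\Pr(B_i|C)$ and $\Pr(A|B_i)$ as in the text.

```python
from fractions import Fraction as F

# Hypothetical edge weights for the tree C -> {B1, B2} -> A.
p_c = F(1, 2)                                   # Pr(C): amount at the root
p_b_given_c = {"B1": F(3, 10), "B2": F(7, 10)}  # Pr(B_i | C), sums to 1
p_a_given_b = {"B1": F(9, 10), "B2": F(2, 10)}  # Pr(A | B_i)

# Rule 1: the path probability is the product of the edge weights.
path_prob = {b: p_a_given_b[b] * p_b_given_c[b] for b in p_b_given_c}

# Multiplying by Pr(C) gives the chance of starting at C and following the path.
joint = {b: path_prob[b] * p_c for b in path_prob}

# Rule 2: sum the path probabilities over the disjoint intermediate nodes.
sum_over_paths = sum(path_prob.values())

print(path_prob)       # weights of C -> B1 -> A and C -> B2 -> A
print(sum_over_paths)  # sum_i Pr(A | B_i) Pr(B_i | C)
```

Exact fractions are used so that the products and sums can be checked by hand without rounding noise.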

The notation suggests flow from right to left. It might be more felicitous to use a different symbol and employ infix (rather than prefix) notation for conditional probabilities, writing $\Pr(B|C)$ as "$B\overset{\Pr}{\leftarrow} C$", perhaps. Then, using this purely syntactic modification, we could write the key part of (1) as

$$A\overset{\Pr}{\leftarrow}B\overset{\Pr}{\leftarrow} C = (A\overset{\Pr}{\leftarrow} B)(B\overset{\Pr}{\leftarrow} C) = \Pr(A|B)\Pr(B|C),$$

which is visually obvious. We might also invent a shorthand for (2) along the lines of

$$A\overset{\Pr}{\leftarrow}\{B_i\}\overset{\Pr}{\leftarrow} C = \sum_i A\overset{\Pr}{\leftarrow}B_i\overset{\Pr}{\leftarrow} C = \sum_i \Pr(A|B_i)\Pr(B_i|C).$$
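As a hedged check of when this sum equals $\Pr(A|C)$, the sketch below computes $\sum_b \Pr(A|b)\Pr(b|C)$ from a full joint table over three binary variables and compares it with $\Pr(A|C)$. The two joint distributions are made up for illustration: one is built as a chain $C \to B \to A$, the other sets $A$ equal to $C$ with $B$ an unrelated fair coin. The helper names (`marginal`, `cond`, `chained`) are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def marginal(joint, keep):
    """Sum a joint table keyed by (a, b, c) down to the positions in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, F(0)) + p
    return out

def cond(joint, target, given):
    """Pr(X_target = x | X_given = y), returned as a dict keyed by (x, y)."""
    pair = marginal(joint, (target, given))
    base = marginal(joint, (given,))
    return {(x, y): p / base[(y,)] for (x, y), p in pair.items()}

def chained(joint):
    """sum_b Pr(A = a | B = b) Pr(B = b | C = c), keyed by (a, c)."""
    a_b = cond(joint, 0, 1)
    b_c = cond(joint, 1, 2)
    return {(a, c): sum(a_b[(a, b)] * b_c[(b, c)] for b in (0, 1))
            for a, c in product((0, 1), repeat=2)}

# Chain C -> B -> A: the joint factors as Pr(a | b) Pr(b | c) Pr(c).
p_c = {0: F(1, 2), 1: F(1, 2)}
p_b1_c = {0: F(1, 4), 1: F(3, 4)}   # Pr(B = 1 | C = c)
p_a1_b = {0: F(1, 5), 1: F(4, 5)}   # Pr(A = 1 | B = b)
markov = {}
for a, b, c in product((0, 1), repeat=3):
    pa = p_a1_b[b] if a == 1 else 1 - p_a1_b[b]
    pb = p_b1_c[c] if b == 1 else 1 - p_b1_c[c]
    markov[(a, b, c)] = pa * pb * p_c[c]

# Counterexample: A is a copy of C, and B is an independent fair coin.
copy = {(a, b, c): (F(1, 4) if a == c else F(0))
        for a, b, c in product((0, 1), repeat=3)}

print(chained(markov) == cond(markov, 0, 2))        # chain: the sum is Pr(A | C)
print(chained(copy)[(1, 1)], cond(copy, 0, 2)[(1, 1)])  # copy: the two differ
```

In the chain the sum reproduces $\Pr(A|C)$ exactly; in the copy example every path through $B$ carries weight $1/2$ even though $A$ is fully determined by $C$, which matches the observation in the question that the simplification does not hold in general.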

These axioms work exactly as if probability were a conserved fluid flowing through the graph. Think of water in pipes (or electricity in perfectly insulated widely separated wires). Absolute probabilities like $\Pr(C)$ are total amounts of water available at nodes. Conditional probabilities like $\Pr(B|C)$ are the proportions of water flowing out of one node $C$ into another node $B$. The first axiom asserts that the proportion (relative to the origin $C$) of water/probability flowing from $C$ to $A$ through an intermediate node $B$ is the proportion from $C$ to $B$ multiplied by the proportion from $B$ to $A$: it merely expresses the arithmetic of proportions or fractions. (Engineers might think of this as a statement about probabilities connected in series.) The second axiom says that the water/probability flowing from $C$ to $A$ is the sum of amounts of water/probability flowing through a collection of intermediate points $B_i$, no two of which are connected. (Engineers will refer to this as a parallel connection.) This is what it means to be conserved: no probability is gained, lost, or transferred among edges as it flows through the graph.
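The conservation claim can be sketched numerically by storing the proportions as tables whose columns each sum to 1 (one column per source node). Composing two such tables is the "series" product summed over the "parallel" intermediate nodes. The node counts, the proportions, and the helper `compose` are assumptions made only for this illustration.

```python
from fractions import Fraction as F

def compose(outer, inner):
    """Entry (i, j) is sum_k outer[i][k] * inner[k][j]: flow from j to i via every k."""
    rows, mid, cols = len(outer), len(inner), len(inner[0])
    return [[sum(outer[i][k] * inner[k][j] for k in range(mid))
             for j in range(cols)]
            for i in range(rows)]

# Hypothetical proportions. Rows are destinations, columns are sources.
b_from_c = [[F(1, 2), F(1, 4)],           # into B1 from C1, C2
            [F(1, 3), F(1, 4)],           # into B2
            [F(1, 6), F(1, 2)]]           # into B3
a_from_b = [[F(1, 2), F(1, 3), F(1)],     # into A1 from B1, B2, B3
            [F(1, 2), F(2, 3), F(0)]]     # into A2

a_from_c = compose(a_from_b, b_from_c)

# Each column still sums to 1: nothing leaks between C and A.
column_sums = [sum(row[j] for row in a_from_c) for j in range(2)]

# Pour 3 units of water into C1 and 1 unit into C2, then see what reaches A.
water_c = [F(3), F(1)]
water_a = [sum(a_from_c[i][j] * water_c[j] for j in range(2)) for i in range(2)]

print(column_sums)
print(water_a, sum(water_a))
```

The total arriving at the $A$ nodes equals the total poured in at the $C$ nodes, which is the sense in which the fluid is conserved.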

whuber