5

I know that:

$P(X=x|Y=y)=\frac{P(X=x,Y=y)}{P(Y=y)}$

But I'd like to understand what $X=x|Y=y$ means by itself. For example, $(X=x,Y=y)$ means $X=x$ and $Y=y$. Would $X=x|Y=y$ mean that $X=x$ when $Y=y$?

An example to make it easier (take these as observations):

$X=\{1,6,7,8\}$

$Y=\{0,1,19,5\}$

$(X=x|Y=y) = ?$

Tim
  • 138,066
  • 3
    Some people will say that "$X=x\mid Y=y$" is not an event, at least not in the original probability space, so it is not meaningful – Henry Oct 03 '21 at 08:49
  • Read "$|$" as "given". Your example probably won't be helpful to you until you specify what $y$ is. – Glen_b Oct 03 '21 at 08:55
  • If for example $x=1$ and $y=0$ then you could read it as $(1,0)$ in the sample space with $\Omega_{Y=0}=\{(1,0), (6,0), (7,0), (8,0)\}$ – Henry Oct 03 '21 at 09:09
  • @Henry a probability can only be evaluated over events since it's a function of the sample space. – Davi Américo Oct 03 '21 at 09:31
  • @DaviAmérico The problem comes from trying to apply $\mathbb P(\cdot)$ to the non-event $X=1 \mid Y=0$ when in fact you are applying $\mathbb P(\cdot \mid Y=0)$ to the event $X=1$ – Henry Oct 03 '21 at 09:37
  • sorry I see no difference – Davi Américo Oct 03 '21 at 09:40
  • 1
    Your example has a probability space of $(\Omega, \mathcal F, \mathbb P)$ where $\mathcal F$ has $2^{16}$ events for which you can find the probability: one of these is $X=1$, another is $Y=0$, while a third is $X=1$ and $Y=0$. But $X=1 \mid Y =0$ is not one of them, and you can construct an example where your $P(X=1 \mid Y=0)$ is not any of the probabilities of any event. When you condition on $Y=0$, you have two choices, both of which change the space: (a) you can restrict $\Omega$ to only include cases where $Y=0$, changing $\mathcal F$ and $\mathbb P$, or (b) only change $\mathbb P$.... – Henry Oct 03 '21 at 10:04
  • 1
    ... Once you have made that change, it does not matter whether you say "$X=1 \mid Y=0$" means the event "$X=1$ and $Y=0$" or say it is the event "$X=1$" in the new probability space. Both will lead to the same probability in the conditional probability space, while neither will necessarily give the correct answer in the original probability space. This means "$X=1 \mid Y=0$" is not an event in the original probability space – Henry Oct 03 '21 at 10:11
  • 2
    The "|" symbol is part of the "P" syntax -- X=x|Y=y doesn't mean anything by itself – jwimberley Oct 03 '21 at 10:20
  • @Henry I get what you mean – Davi Américo Oct 03 '21 at 22:08
  • All of these answers are pretty good; I'd consider all of them if I were able to. – Davi Américo Oct 03 '21 at 22:09

3 Answers

7

The $|$ symbol in probability theory stands for “given”. You would most commonly see it used for the conditional probability $P(Y|X)$, the probability of $Y$ given $X$. While it’s a slight abuse of notation, you could see something like

$$ Y|X \sim \mathcal{N}(\mu, \sigma) $$

to say that $Y$, conditionally on $X$, follows a normal distribution. You would also see it used to denote properties of conditional distributions, like the conditional expectation $E[Y|X]$, or the conditional variance $\operatorname{Var}(Y|X)$, etc.

Notice that something like $X|Y$ alone doesn’t make much sense. What would it be? “A random variable conditional on another random variable”? Conditioning is about the perspective you take when looking at the variable, not a property of the variable. You can “transform” a conditional probability into a joint or a marginal one, or reverse it (Bayes' theorem), with just simple mathematical manipulations on the distributions.
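As a sketch of those manipulations, assuming discrete $X$ and $Y$ with $P(Y=y)>0$ and $P(X=x)>0$ (assumptions added here, matching the notation in the question), the joint, the marginal and the reversed conditional all follow from the definition of conditional probability:

$$
\begin{aligned}
P(X=x, Y=y) &= P(X=x \mid Y=y)\,P(Y=y) && \text{(joint)}\\
P(X=x) &= \sum_{y'} P(X=x \mid Y=y')\,P(Y=y') && \text{(marginal)}\\
P(Y=y \mid X=x) &= \frac{P(X=x \mid Y=y)\,P(Y=y)}{P(X=x)} && \text{(Bayes' theorem)}
\end{aligned}
$$

In every line the bar sits inside $P(\cdot \mid \cdot)$; none of the steps needs "$X=x \mid Y=y$" to be an object on its own.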

Tim
  • 138,066
  • (+1) Unfortunately, not many elementary textbooks explicitly state the conditioning in a distribution such as $Y|X \sim \mathcal{N}(\mu, \sigma)$, and they treat conditional probabilities as an elusive concept. – patagonicus Oct 04 '21 at 07:42
6

Example: Say you have a group of men and women and know their handedness (left/right), as depicted in the table below:

$$\begin{array}{r|c|c|c}
 & \text{men} & \text{women} & \text{total}\\ \hline
\text{left handed} & 9 & 4 & 13\\ \hline
\text{right handed} & 43 & 44 & 87\\ \hline
\text{total} & 52 & 48 & 100
\end{array}$$

Say you pick a person at random out of this group; then there is a $13\%$ probability that they are left handed. But if you know that the person is a woman, then the probability is $4/48 \approx 8\%$.

To express this latter case, the probability of an event, given another event or condition, one uses the vertical bar symbol $\vert$.

$$P(X\vert Y) = \text{probability of event $X$ given/conditional on event $Y$}$$

So it involves both events $X$ and $Y$. But this is different from $P(X,Y)$, the probability that both $X$ and $Y$ happen.

The probability of left handedness given that a person is a woman ($4/48$) is not equal to $4\%$, the probability that someone is a woman and left handed.
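The three numbers above can be checked by counting. This is only a minimal sketch: the counts are copied from the table, while the helper `prob` and the predicates `is_left` and `is_woman` are names made up for this illustration.

```python
from fractions import Fraction

# Counts copied from the table: counts[(handedness, sex)]
counts = {
    ("left", "men"): 9,
    ("left", "women"): 4,
    ("right", "men"): 43,
    ("right", "women"): 44,
}


def prob(event, given=lambda cell: True):
    """P(event | given) by counting: keep the cells where `given` holds,
    then take the share of those cells where `event` also holds."""
    kept = {cell: n for cell, n in counts.items() if given(cell)}
    favourable = sum(n for cell, n in kept.items() if event(cell))
    return Fraction(favourable, sum(kept.values()))


def is_left(cell):
    return cell[0] == "left"


def is_woman(cell):
    return cell[1] == "women"


p_left = prob(is_left)                                          # 13/100
p_woman = prob(is_woman)                                        # 48/100
p_left_and_woman = prob(lambda c: is_left(c) and is_woman(c))   # 4/100
p_left_given_woman = prob(is_left, given=is_woman)              # 4/48

print(p_left, p_left_and_woman, p_left_given_woman)
# The conditional probability equals joint / marginal, not the joint itself.
print(p_left_given_woman == p_left_and_woman / p_woman)
```

Note that `given` is an argument of `prob`, separate from `event`: the condition changes which rows are counted in the denominator, it is not merged into the event.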


The expression $X\vert Y$ occurs within the probability operator $P()$. But you should not read all the contents as a single event.

  • So this is not how you should interpret it: "$P(\dots)$ is the probability of the event on the dots. So $P(X\vert Y)$ is the probability of the event $X\vert Y$." This $X\vert Y$ is not an event (as Henry noted in the comments). The vertical bar $\vert$ adds additional parameters to the probability operator and refers to conditions.
4

This can get quite philosophical fast. But Judea Pearl's book Causal Inference in Statistics: A Primer (Section 1.3.3) provides a nice intuition: in the frequentist interpretation, the operator $|$ implies a filtering of the data. An intuitive example would be two variables with bounds, $P(X>a|Y<b)$: conditioning implies filtering the data, i.e., first removing the parts of the data where the condition $Y<b$ doesn't hold.
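A small sketch of that filtering view, on simulated data. Everything concrete here is an assumption for illustration only: the data-generating process, the thresholds `a` and `b`, and the helper `frac` are not taken from the book.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical data: 10,000 paired observations (x, y), with y correlated to x.
pairs = []
for _ in range(10_000):
    x = random.gauss(0, 1)
    y = x + random.gauss(0, 1)
    pairs.append((x, y))

a, b = 0.5, 0.0  # illustrative thresholds


def frac(rows, pred):
    """Relative frequency of rows satisfying `pred`."""
    return sum(1 for row in rows if pred(row)) / len(rows)


# Step 1: filter, keeping only the rows where the condition Y < b holds.
kept = [(x, y) for (x, y) in pairs if y < b]

# Step 2: within the filtered rows, measure how often X > a.
p_cond = frac(kept, lambda row: row[0] > a)

# The same quantity via the ratio P(X > a, Y < b) / P(Y < b) on the full data.
p_joint = frac(pairs, lambda row: row[0] > a and row[1] < b)
p_y = frac(pairs, lambda row: row[1] < b)

print(p_cond, p_joint / p_y)
```

The two printed values agree up to floating-point rounding, which is the empirical counterpart of $P(X>a \mid Y<b) = P(X>a, Y<b)/P(Y<b)$.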

As for whether $X|Y$ is an event or not: it isn't an event in plain form, but the resulting filtering operation leads to an event. The philosophical part is then what the Bayesian interpretation would be, and whether conditional probability can even exist in isolation.

patagonicus
  • 2,540