
I have a question about Bayesian updating. In general, Bayesian updating refers to the process of obtaining the posterior from a prior belief distribution.

Alternatively, one could understand the term as using the posterior of the first step as the prior input for further calculation.

Below is a simple calculation example. Method a is the standard calculation; method b uses the posterior output as the input prior for calculating the next posterior.

Using method a, we get P(F|HH) = 0.2, while method b gives P(F|HH) ≈ 0.05, as the worked example below shows.


Problem: You toss a coin twice and get two Heads. What is the probability that the coin is fair, i.e. $Pr(Fair\ coin|HH)$?

Now for the first toss: $Pr(Fair\ coin| H) = \frac{Pr(Head|Fair)\cdot P(Fair)}{Pr(Head|Fair) \cdot P(Fair)+Pr(Head|Biased) \cdot P(Biased)} = \frac{Pr(H|F)\cdot P(F)}{P(H)} \quad\quad (1)$

Assuming a starting prior belief P(Fair) = 0.5, we want to find P(F|H) for the first toss.

Below are the calculations for the intermediate steps:

$P(H|F)= {n \choose x} \theta^{x}(1-\theta)^{n-x} = {1 \choose 1} 0.5^{1}(0.5)^{0}= 0.5$

$P(H)= P(H|F) \cdot P(F)+ P(H|Biased) \cdot P(Biased)=(0.5 \cdot 0.5) +(1 \cdot 0.5) = 0.75$

(Note: P(H|Biased) = 1 because we assume an extreme example of a coin with Heads on both sides, so the probability of getting Heads with the biased coin is 1. This keeps the calculation easy.)

Hence, plugging into (1), we get :

$Pr(F| H) =\frac{Pr(H|F)\cdot P(F)}{P(H)} = \frac{0.5 \cdot 0.5}{0.75} = 0.33$
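For readers who want to check the numbers, here is a minimal Python sketch of this first update (the variable names are mine, not part of the original question):

```python
# First update: posterior probability that the coin is fair after one Head.
p_fair = 0.5       # prior P(Fair)
p_h_fair = 0.5     # P(H | Fair)
p_h_biased = 1.0   # P(H | Biased): the two-headed coin always shows Heads

p_h = p_h_fair * p_fair + p_h_biased * (1 - p_fair)  # P(H) = 0.75
p_fair_given_h = p_h_fair * p_fair / p_h             # P(F|H) = 1/3
print(p_fair_given_h)                                # 0.3333...
```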


Now, we toss the coin again and get another H. To calculate $Pr(F|HH)$, we

a) continue using P(Fair)=0.5

$Pr(F|HH) = \frac{Pr(HH|F)\cdot P(F)}{Pr(HH|F) \cdot P(F)+Pr(HH|Biased) \cdot P(Biased)} = \frac{Pr(HH|F)\cdot P(F)}{P(HH)} \quad\quad (2)$

$P(HH|F)= {n \choose x} \theta^{x}(1-\theta)^{n-x} = {2 \choose 2} 0.5^{2}(0.5)^{0}= 0.25$

$P(HH)= P(HH|F) \cdot P(F)+ P(HH|Biased) \cdot P(Biased)=(0.25 \cdot 0.5) +(1 \cdot 0.5) = 0.625$

Hence, plugging into (2), $Pr(F|HH) =\frac{Pr(HH|F)\cdot P(F)}{P(HH)} = \frac{0.25 \cdot 0.5}{0.625} = 0.2$
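Method a in code, for comparison: keep the original prior and use the two-toss likelihood (again a sketch with my own variable names):

```python
# Method a: one single update using the full two-toss likelihood P(HH | .).
p_fair = 0.5              # original prior P(Fair)
p_hh_fair = 0.5 ** 2      # P(HH | Fair) = 0.25
p_hh_biased = 1.0         # P(HH | Biased) = 1

p_hh = p_hh_fair * p_fair + p_hh_biased * (1 - p_fair)  # P(HH) = 0.625
print(p_hh_fair * p_fair / p_hh)                        # P(F|HH) = 0.2
```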


Alternatively, what if we calculate $Pr(F|HH)$ by using

b) our updated belief P(Fair) = 0.33, which we obtained from Pr(F|H) in the first step?

In this case,

$P(HH|F)= {n \choose x} \theta^{x}(1-\theta)^{n-x} = {2 \choose 2} 0.33^{2}(1-0.33)^{0}= 0.1089$

$P(HH)= P(HH|F) \cdot P(F)+ P(HH|Biased) \cdot P(Biased)=(0.1089 \cdot 0.33) +(1 \cdot 0.67) = 0.705937$

Hence, plugging into (2), $Pr(F|HH) =\frac{Pr(HH|F)\cdot P(F)}{P(HH)} = \frac{0.1089 \cdot 0.33}{0.705937} = 0.05091$
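For completeness, here is a sketch that reproduces method b exactly as computed above, including its substitution of 0.33 for the coin's head probability and its reuse of both tosses:

```python
# Method b as written in the question: the posterior 0.33 is reused both as
# the prior P(Fair) and as theta in the likelihood, and both tosses are
# counted again -- the answer below explains why this is not valid updating.
p_fair = 0.33
p_hh_fair = p_fair ** 2   # 0.1089, using 0.33 as theta
p_hh_biased = 1.0
p_hh = p_hh_fair * p_fair + p_hh_biased * (1 - p_fair)  # 0.705937
print(p_hh_fair * p_fair / p_hh)                        # approx 0.0509
```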


So method a gives P(F|HH) = 0.2, while method b gives P(F|HH) ≈ 0.05. My question: to what extent is method b a valid approach?

TinaW
  • How did you reason that the probability of getting heads given a biased coin is 1? Anyway, no matter how many times you flip the coin, the probability that it is fair is zero. – Neil G Nov 05 '16 at 18:50
  • If the coin is biased and we see a Head it means the coin has Head on both sides. We will always see Head, so probability of getting Head with a biased coin=1. – TinaW Nov 05 '16 at 18:59
  • Usually a biased coin just means that it's not fair, but it could have any bias. You should make it clear in your question that you're only considering the possibilities that either the coin is perfectly fair, or else it always comes up heads. You might want to recast this as an urn problem since that's not a very realistic coin. – Neil G Nov 05 '16 at 19:01
  • Indeed, a coin is either totally fair or totally biased. If it were a die, there would be different scenarios of bias. For a coin, there's only one option. – TinaW Nov 05 '16 at 19:03
    "Indeed, a coin is either totally fair or totally biased." — Not really. There are very very few coins that either totally fair or "totally biased". – Neil G Nov 05 '16 at 19:04
  • Are you pointing to the option that although there's Heads on one side and tails on the other, the metal on one side is heavier than the other and hence tends to fall more on a particular side? Fine, I'm happy with an alternate calculation for the P(H|Biased). However, the focus of the question is about using priors recursively from previous posterior results. It's widely used from what I see. – TinaW Nov 05 '16 at 19:08
  • Most coins have a bias, some number, like 0.56. That's a biased coin. It falls heads 56% of the time and tails 44% of the time. Your biased coin falls heads 100% of the time. How would you design a coin like that anyway? – Neil G Nov 05 '16 at 19:09
  • Both sides heads. – TinaW Nov 05 '16 at 19:10
  • Fair enough. But that's not what people usually mean when they say "biased coin". – Neil G Nov 05 '16 at 19:11
  • No problem. The focus of the question is about using priors recursively from previous posterior results. It's widely used from what I see. – TinaW Nov 05 '16 at 19:11
  • Yes, I see. If you do this in log-odds space, then the updates are additive and this is much easier. Your first approach seems right: 1:1 after no observations, then 2:1, then 4:1, etc. Something's wrong with your second approach. Good luck. – Neil G Nov 05 '16 at 19:14
  • Interesting, Thanks. Yes, the first approach is standard and not updating the prior. It just seems inefficient not to make use of the updated information from the first toss. – TinaW Nov 05 '16 at 19:23
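Neil G's odds-space remark above can be illustrated with a short sketch (my code, not part of the thread): with prior odds 1:1 between the two-headed and the fair coin, each observed Head multiplies the odds by the likelihood ratio P(H|Biased)/P(H|Fair) = 2, giving 2:1 and then 4:1.

```python
# Odds-space updating: additive in log-odds, multiplicative in odds.
odds_biased = 1.0                   # prior odds Biased:Fair = 1:1
for toss in range(1, 3):            # two Heads observed, one at a time
    odds_biased *= 1.0 / 0.5        # likelihood ratio P(H|B)/P(H|F) = 2
    p_fair = 1 / (1 + odds_biased)  # convert odds back to P(Fair | data)
    print(toss, odds_biased, p_fair)
# toss 1: odds 2:1, P(F|H)  = 1/3
# toss 2: odds 4:1, P(F|HH) = 0.2
```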

1 Answer


Your approach b) is wrong: single-step updating, in which all the data are used together to update the prior and arrive at the posterior, and Bayesian sequential (also called recursive) updating, in which the data are used one at a time and each posterior becomes the prior of the next iteration, must give exactly the same result. This is one of the pillars of Bayesian statistics: consistency.

Your error is simple: once you have updated the prior with the first sample (the first "Head"), only one sample remains to be included in the likelihood when you update the new prior. In formulas:

$$P(F|HH) =\frac{P(H|H,F)P(F|H)}{P(H|H)} $$

This formula is just Bayes' theorem, applied after the first event "Head" has already happened: since conditional probabilities are probabilities themselves, Bayes' theorem is also valid for probabilities conditioned on the event "Head", and there's really nothing more to prove. However, I have found that sometimes people don't find this result self-evident, so I give a slightly long-winded proof.

$$P(F|HH) =\frac{P(HH|F)P(F)}{P(HH)}= \frac{P(H|H,F)P(H|F)P(F)}{P(HH)}$$

by the chain rule of conditional probabilities. Then, multiplying numerator and denominator by $P(H)$, you get

$$\begin{align}
\frac{P(H|H,F)P(H|F)P(F)}{P(HH)} &= \frac{P(H|H,F)P(H|F)P(F)P(H)}{P(HH)P(H)} \\
&= \frac{P(H|H,F)P(H)}{P(HH)}\cdot\frac{P(H|F)P(F)}{P(H)} \\
&= \frac{P(H|H,F)}{P(H|H)}\cdot\frac{P(H|F)P(F)}{P(H)} \\
&= \frac{P(H|H,F)P(F|H)}{P(H|H)}
\end{align}$$

where in the last step I just applied Bayes' theorem. Now:

$$P(H|H,F)= P(H|F)=0.5$$

This is obvious: conditionally on the coin being fair (or biased), we are modelling the coin tosses as i.i.d. Applying the same idea to the denominator, we get:

$$P(H|H)= P(H|F,H)P(F|H)+P(H|B,H)P(B|H)=P(H|F)P(F|H)+P(H|B)P(B|H)=0.5\cdot0.\bar{3}+1\cdot0.\bar{6}=0.8\bar{3}$$

Finally:

$$P(F|HH) =\frac{P(H|H,F)P(F|H)}{P(H|H)}=\frac{0.5\cdot0.\bar{3}}{0.5\cdot0.\bar{3}+1\cdot0.\bar{6}}=0.2$$

QED
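As a quick check, here is a small Python sketch of the sequential update described above (my own illustration): the posterior after each toss becomes the prior for the next, the likelihood covers only the new toss, and the result matches the single-step answer.

```python
# Correct sequential (recursive) updating: the posterior becomes the next
# prior, and the likelihood includes only the NEW observation.
p_fair = 0.5                        # initial prior P(Fair)
for _ in range(2):                  # two Heads, processed one at a time
    num = 0.5 * p_fair              # P(H|F) * current prior
    den = num + 1.0 * (1 - p_fair)  # + P(H|B) * current P(Biased)
    p_fair = num / den              # posterior -> prior for the next toss
print(p_fair)                       # 0.2, identical to the single-step result
```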


That's it: have fun using Bayesian sequential updating; it's very useful in a lot of situations! If you want to know more, there are many good resources on the Internet.

DeltaIV