Is this a geometric distribution problem?

Question

Suppose a student starts with test A and, each time he passes, proceeds to the next test: first B, then C.

The probability for the student to pass test A is 30%. The probability for the student to pass test B is 20%. The probability for the student to pass test C is 10%.

The student has maximum 20 tries to attempt for the tests in TOTAL.

How do I calculate the probability that the student passes test C, starting from test A, within 20 tries? My hunch is to use a geometric distribution, but I am unsure, since this problem involves multiple "stages", each with a different probability.

Clarification: If the student passes test A, he does not have to take A again, and can use the remaining attempts (20 minus attempts used to pass A) for B and C. Same goes for B.
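A rough Monte Carlo sketch of this clarified rule can serve as a sanity check on any analytical answer. It assumes every try is independent with the stated pass probabilities; the names `PASS_PROB`, `passes_all` and `estimate` are made up for illustration.

```python
import random

PASS_PROB = [0.3, 0.2, 0.1]  # pass probabilities for tests A, B, C
MAX_TRIES = 20               # total tries shared across all three tests


def passes_all(rng):
    """Simulate one student; return True if C is passed within MAX_TRIES."""
    stage = 0  # 0 = working on A, 1 = on B, 2 = on C
    for _ in range(MAX_TRIES):
        if rng.random() < PASS_PROB[stage]:
            stage += 1  # a passed test is never retaken
            if stage == len(PASS_PROB):
                return True
    return False


def estimate(trials=50_000, seed=0):
    """Estimate P(pass C within MAX_TRIES) by simulation (assumes independence)."""
    rng = random.Random(seed)
    return sum(passes_all(rng) for _ in range(trials)) / trials


estimate_20 = estimate()
print(f"Simulated P(pass C within {MAX_TRIES} tries) ~ {estimate_20:.3f}")
```

The estimate carries sampling noise, so it is only useful for checking that an exact calculation is in the right region.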

Can you clarify: it sounds like the student gets 20 attempts at the sequence A-> B -> C (so the student would be able to attempt A up to 20 times plus some attempts at B and C as passing allows, meaning as many as 60 tests could be attempted in the right circumstances)... or did you mean that the student can take 20 tests in total, being some combinations of A's, B's and C's as the passing permits. — Glen_b, Aug 25 '22 at 12:53
Is there an assumption of independence which you didn't write? If the student attempts test A several times, is the probability of passing A the second time, conditional on having failed the first time, still 30%? Same question for B and C, and for passing B conditional on having passed A, etc. — Stef, Aug 25 '22 at 13:06
Note that the different answers that were posted follow different interpretations of your problem. This is because the situation you described is quite ambiguous. Please rephrase the description of how this exam system works! — Stef, Aug 25 '22 at 13:25
To clarify: If the student passes test A, he does not have to take A again, and can use the remaining attempts (20 minus attempts used to pass A) for B and C. Same goes for B. @Stef — user7381027, Aug 25 '22 at 13:26
@user7381027 Well, in that case, user2974951's answer with the Markov chain is the only correct one. — Stef, Aug 25 '22 at 13:28

user2974951 · Accepted Answer · 2022-08-25T13:17:28.270

This can be solved using Markov chains. First define your 4x4 transition matrix (A, B, C, F - final state), probabilities of passing from one state to the next. I am assuming there is only one direction of progression, forward, i.e. there is no rollback to previous tests upon failure. If this is not true then the matrix below can be slightly changed to account for this.

    A   B   C   F
A 0.7 0.3 0.0 0.0
B 0.0 0.8 0.2 0.0
C 0.0 0.0 0.9 0.1
F 0.0 0.0 0.0 1.0

Then you need to raise this matrix to the power of 20, one multiplication per attempt. Entry (i, j) of the result is the probability of being in state j after 20 attempts, given that you started in state i. The result is

             A          B         C         F
A 0.0007979227 0.03219388 0.2979484 0.6690598
B 0.0000000000 0.01152922 0.2200949 0.7683759
C 0.0000000000 0.00000000 0.1215767 0.8784233
F 0.0000000000 0.00000000 0.0000000 1.0000000

Here you would look at the first row, since you started in state A. The probability of being in state F (the final state, i.e. test C has been passed) after 20 attempts, starting in state A, is roughly 67%.
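A minimal sketch of this calculation in plain Python, assuming the forward-only transition matrix given above. The helper names `mat_mul` and `mat_pow` are made up for illustration; a linear-algebra library's matrix power would do the same job.

```python
# Transition matrix from the answer; rows and columns are A, B, C, F.
P = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 0.0, 1.0],
]


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def mat_pow(M, n):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    size = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, M)
    return result


P20 = mat_pow(P, 20)
p_finish = P20[0][3]  # start in A (row 0), end in F (column 3)

for name, row in zip("ABCF", P20):
    print(name, " ".join(f"{x:.7f}" for x in row))
print(f"P(in F after 20 attempts, starting from A) = {p_finish:.7f}")
```

Because F is absorbing, the (A, F) entry of the 20th power is the probability of having passed C at any point within the 20 attempts, not only on the 20th.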

edited Aug 25 '22 at 13:17

answered Aug 25 '22 at 12:51


I seem to understand that the student always falls back to state A when they fail an exam. – Stef Aug 25 '22 at 13:08
@Stef I understood differently, these are independent, there is no rollback to A. – user2974951 Aug 25 '22 at 13:08
Re-reading the question, both interpretations seem plausible. Since the question is ambiguous, perhaps you could be specific about your interpretation in your answer? Especially since your answer gives a method that works with both interpretations - only the matrix has to be changed to fit one or the other. – Stef Aug 25 '22 at 13:15
Anyway, I upvoted. The OP already identified that their problem involves "multiple stages", and it's great to show them that Markov chain is the go-to tool for this kind of situation. – Stef Aug 25 '22 at 13:16
(My interpretation also comes from the OP's suggestion that this might be related to a geometric distribution: if the student has 20 attempts at A->B->C->F re-starting from A at every failure, then there is indeed a geometric distribution there; but if the student never goes back to A then there is no geometric distribution anywhere in this problem) – Stef Aug 25 '22 at 13:23

score 0 · Answer 2 · answered Aug 25 '22 at 12:20

No.

Since the probabilities are different at each time point, this data will not follow a geometric distribution.

I also don't quite understand the "The student has maximum 20 tries to attempt for the tests in TOTAL." bit, but that's not necessary to answer your question.

Eoin

Thanks for the reply. What I meant was that the student has 20 attempts in total. So I need to calculate the probability of the student passing all 3 tests within these 20 attempts. Could you point me in the right direction? – user7381027 Aug 25 '22 at 12:25

Glen_b · Answer 3 · 2022-08-25T23:30:11.027

Suppose we call "an attempt" a single try at test A, followed by B and then C in turn, each taken only if the preceding test is passed; an attempt fails if any of A, B or C is failed.

The probability of passing all three is $0.3 \times 0.2 \times 0.1 = 0.006$. In short P(attempt succeeds) = 0.006.

In that case, the probability of succeeding at least once in 20 attempts could be done from first principles:

P(succeed at least once) = 1 - P(don't succeed even once)
$= 1 - 0.994^{20} \approx 0.1134 = 11.34\%$

Or it would typically be calculated using probabilities from a binomial distribution.

The geometric is used when calculating the distribution of the number of trials to the first success; however, you're correct to suppose that it can be used to solve this problem; if the first success occurs after trial $k$ there was no success by trial $k$. So you're just doing a calculation in the other tail. Since tail sums of geometric p.f.s are simple, this is quite doable -- but a little more effort than a more direct calculation in this instance.
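As a small sketch of the two routes just described, under this answer's interpretation (each "attempt" is a full A, B, C run with success probability 0.006, and attempts are assumed independent), the direct complement and the geometric tail sum can be compared numerically. The variable names are illustrative only.

```python
p_attempt = 0.3 * 0.2 * 0.1  # P(a single A -> B -> C attempt succeeds)
n_attempts = 20

# Direct route: complement of "no success in any of the 20 attempts".
direct = 1 - (1 - p_attempt) ** n_attempts

# Geometric route: P(first success occurs on attempt k), summed over k = 1..20.
geometric_cdf = sum(
    (1 - p_attempt) ** (k - 1) * p_attempt for k in range(1, n_attempts + 1)
)

print(f"direct complement : {direct:.4f}")
print(f"geometric tail sum: {geometric_cdf:.4f}")
```

Both routes give the same number up to floating-point error, which is the sense in which the geometric distribution "can be used" here.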
