How do I get an accurate probability when my program that tests in the millions returns significantly different results each time?

Question

Apologies in advance, I have never (seriously) studied statistics and I'm out of my depth, so this may not be a simple question.

I wrote a program in Java to estimate the likelihood of a Tenhou in Mahjong, which is basically just a very rare hand. I set it to run a series of tests, a million at a time, each test simulating one hand. It deals out tiles: if the hand is a Tenhou, that's a pass; otherwise it fails. Sometimes there is only one Tenhou out of a million. Other times there are 10 or more out of a million. If I run it for, say, 50 million, the results still vary quite a bit.

I'd like to get an accurate probability. Is there some kind of formula or principle that I can apply? Even if I could just get a link to an article to read, that would be helpful - I just don't even know where to start.

Edit: Imagine you are playing 5 card poker. You are dealt five cards. If it's not a royal flush, shuffle the cards and try again. That's essentially all my program is doing, but it's doing it one million times, several times.

To be more specific - it shuffles the 136 tiles in a Mahjong set, deals out 14 of them. If those 14 don't constitute a winning hand, shuffle them all again and deal out a new hand of 14. Do that a million times, several times. So, it's not simulating a real game with many different variables - there's just the one variable (the hand that is dealt) and one parameter (condition for a winning hand).

My issue is that when running that same process over and over, the outcome sometimes varies by a factor of 10 or more - that is, there might be one winning hand out of one million dealt, or there may be ten or more winning hands out of one million dealt. My question is, how do I reconcile a spread that large? Please let me know if more clarification is needed.

I don't know anything about Mahjong beyond that it is a popular game. But, if it is a complicated game then there probably isn't a direct formula for it. With that in mind, a Monte Carlo approach as you have reported sounds like a reasonable way to go. What you are discovering is that there is a sampling distribution associated with estimating the probability of Tenhou in this way. — Galen, Jun 23 '21 at 23:52
You might try finding the maximum likelihood estimate of the probability of Tenhou from the binomial distribution where each game is treated as a Bernoulli trial. — Galen, Jun 23 '21 at 23:54
I expect you will be more likely to get a useful answer if you provide more details. Is a tenhou produced simply by randomly picking tiles, and seeing if the first $n$ chosen tiles satisfy some property? — jkpate, Jun 24 '21 at 00:15
Unclear whether this is about statistics or about programming a particular simulation. It can be difficult to get useful answers for very rare events by simulation. — BruceET, Jun 24 '21 at 00:17
@Galen I edited my post to provide more information. I admit I have no idea what your second comment means. — jermriddled, Jun 24 '21 at 00:38
@jkpate I edited my post to provide more information, and yes, my program runs exactly as you described — jermriddled, Jun 24 '21 at 00:40
@jermriddled MLE of success of a trial with binomial distribution is something I recommend you read into. — Galen, Jun 24 '21 at 00:49
I just recalled that I did something vaguely related to this some time ago with chess. Feel free to consider whether the continuity-corrected confidence interval I used is appropriate for your use case. — Galen, Jun 24 '21 at 00:59
If there is a specific tile that needs to be in the hand, then you could sample from a space with a higher probability of Tenhou by only considering cases where that crucial tile is in the top 14 tiles. In this situation, you then compute the probability of that tile being in the first 14, and back into the unconditional probability. // One problem you may encounter with pRNG is that you might exhaust the period of the random generator before you're even close to exploring the space of all $136!$ orderings of tiles, so you're deterministically exploring the same orders again. — Sycorax, Jun 24 '21 at 01:03
Of course... if you're doing the work of figuring out the probabilities directly, it appears that this has some treatment here https://boardgames.stackexchange.com/questions/908/in-mahjong-what-is-the-chance-to-get-a-hand-that-allows-a-player-to-finish-on — Sycorax, Jun 24 '21 at 01:07
You might also want to look at http://arcturus.su/wiki/Tenhou_and_chiihou#Probability — Henry, Jun 24 '21 at 01:22
You could test your simulation coding by including estimation of some more probable events! — kjetil b halvorsen, Jun 25 '21 at 22:48

BruceET · Answer 1 · 2021-06-24T06:53:16.153

Suppose we want to know the probability of getting 4 Aces in a 5-card poker hand dealt from a standard American deck. By easy combinatorics that's $48/{52\choose 5} = 48/2\,598\,960 = 1.846893 \times 10^{-5},$ or a little less than $2$ chances in $100\,000.$

    dhyper(4, 4, 48, 5)   # P(4 aces in 5 cards drawn from 4 aces and 48 other cards)
    [1] 1.846893e-05

It is easy to write a program in R to simulate a million hands, but difficult to get an approximation by simulation that has small relative error.

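    set.seed(2021)
    deck = c(1,1,1,1, rep(0,48))                  # 1 = ace, 0 = any other card
    nr.a = replicate(10^6, sum(sample(deck,5)))   # number of aces in each of a million hands
    mean(nr.a==4)                                 # simulated P(4 aces)
    [1] 2e-05        # exact value is about 0.000018
    2*sd(nr.a==4)/1000                            # 2 standard errors; 1000 = sqrt(10^6)
    [1] 8.944187e-06 # approximate 95% margin of simulation error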

So we get $0.00002 \pm 0.00001.$ This interval contains the correct value, but the margin of simulation error is about half as large as the quantity being simulated.
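The size of that margin can be cross-checked without simulating, because each hand is a Bernoulli trial and the number of successes is binomial. A minimal sketch, assuming the exact probability from `dhyper` stands in for the simulated proportion (the variable names are only illustrative):

```r
p   <- dhyper(4, 4, 48, 5)     # exact P(4 aces in 5 cards)
n   <- 10^6                    # hands per simulation run
se  <- sqrt(p * (1 - p) / n)   # standard error of the simulated proportion
moe <- 2 * se                  # approximate 95% margin of simulation error
rel <- moe / p                 # margin relative to the quantity being estimated
c(expected.hits = n * p, moe = moe, rel = rel)
```

With only about 18 expected successes per million hands, the relative margin is roughly $2/\sqrt{18},$ a bit under one half, which agrees with the simulated margin above.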

Two additional runs give:

    nr.a = replicate(10^6, sum(sample(deck,5)))
    mean(nr.a==4)
    [1] 1.9e-05
    nr.a = replicate(10^6, sum(sample(deck,5)))
    mean(nr.a==4)
    [1] 2e-05

My simulations are as accurate as expected, but are not giving useful approximations. If I had the patience to do simulations with 10 or 100 million iterations, I would do somewhat better, but the probability of getting 4 aces in a poker hand is not something I would choose to simulate.
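To see how much "somewhat better" costs, the margin formula can be inverted to give the number of hands needed for a chosen relative margin. This is a rough sketch based on the normal approximation to the binomial; the function name `iters.needed` is made up for illustration, and the rate of 3 per million in the last line is an assumed value picked from the 1 to 10 per million range reported in the question, not a known probability.

```r
# Hands needed so that 2 standard errors equal `rel` times p
# (normal approximation; illustrative helper, not a library function).
iters.needed <- function(p, rel) {
  ceiling((2 / rel)^2 * (1 - p) / p)
}

p.aces <- dhyper(4, 4, 48, 5)
iters.needed(p.aces, 0.50)   # roughly the precision a million hands delivers
iters.needed(p.aces, 0.05)   # a 5% relative margin
iters.needed(3e-6, 0.10)     # assumed rate of 3 per million, 10% relative margin
```

Because the margin shrinks only as $1/\sqrt{n},$ each tenfold gain in precision costs a hundredfold increase in iterations: a 5% relative margin for the four-aces hand takes on the order of 87 million hands.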

I don't know the probability of a Tenhou in Mahjong, but I suspect you may have the same kind of difficulty trying to approximate that probability by simulation.
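If you stay with simulation, one reasonable way to reconcile runs that disagree is to pool them into a single binomial count and report an interval instead of a single number. The sketch below uses `binom.test` from base R; the counts in `hits` are invented for illustration and are not taken from the question.

```r
# Hypothetical successes from five runs of a million deals each
# (made-up numbers, for illustration only).
hits  <- c(1, 10, 4, 3, 2)
deals <- rep(10^6, length(hits))

fit <- binom.test(sum(hits), sum(deals))  # exact (Clopper-Pearson) interval
fit$estimate                              # pooled proportion, the MLE of p
fit$conf.int                              # 95% confidence interval for p
```

The pooled proportion is the maximum likelihood estimate mentioned in the comments, and the width of the interval shows directly how little a handful of successes pins down the probability.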

Note: A Google search fetched the sentence: "A quick division of the two values shows that the probability of Tenhou or Chihou is around $3.982 \times 10^{-12}$ or somewhere around 1 in two trillion." (Like anything on the Internet, it may or may not be true.) Presumably this is from one of several articles listed. Perhaps this one.
