4

Just a little thought I've been having. If we rolled a fair die 60 times and got 60 sixes in a row, we would (wrongly?) be sure that something fishy was going on. Is there any statistical measure that could draw our attention to this potential problem? Or is it simply not a problem, since this sequence is just as likely as any other sequence of 60 rolls (e.g. 10 occurrences of each of the six numbers, or 60 fives, etc.)?

I'm guessing we could use the weak law of large numbers to calculate the probability that, after 60 trials, the average observation differs from the expected value of the die by 2.5, as it does in this case?

lodhb
  • 463

2 Answers

5

The probability of getting $n$ identical copies of a prespecified face is $1/6^n$. Of course a long run is suspect.
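To get a feel for how small $1/6^n$ is at $n = 60$, here is a minimal sketch using exact rational arithmetic (the variable names are my own):

```python
from fractions import Fraction

# Probability of any one *prespecified* sequence of 60 rolls of a fair die,
# e.g. sixty sixes. Every specific sequence of length 60 has this same
# probability -- which is the crux of the question.
p_sequence = Fraction(1, 6) ** 60

print(float(p_sequence))  # on the order of 2e-47
```

The point the answer makes is that "60 sixes" and "any other particular sequence of 60 rolls" share this same tiny probability; what differs is how surprising each is relative to an alternative hypothesis.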

These kinds of questions can be resolved with hypothesis tests. The null hypothesis is that the die is fair, and the alternative hypothesis is that it is not. You then compute the likelihood ratio and, depending on whether it is greater or less than a threshold, accept or reject the null hypothesis. The procedure is detailed here.
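As one concrete illustration (a chi-square goodness-of-fit test rather than the full likelihood-ratio procedure, and with the observed counts made up for the example), a fairness test on the face counts might look like:

```python
# Hypothetical observation: sixty sixes in 60 rolls.
observed = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 60}
n = sum(observed.values())
expected = n / 6  # 10 per face under the null hypothesis of a fair die

# Pearson's chi-square statistic: sum over faces of (O - E)^2 / E
chi2 = sum((observed[face] - expected) ** 2 / expected for face in range(1, 7))

# Critical value of the chi-square distribution with 5 degrees of freedom
# at the 5% significance level (from a standard table).
CRITICAL_5DF_05 = 11.07
reject_fairness = chi2 > CRITICAL_5DF_05

print(chi2, reject_fairness)  # 300.0 True -- reject fairness decisively
```

Note that this test looks only at the face frequencies, not at their order, which foreshadows the second answer's point that no single frequentist test catches every kind of deviation.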

Another relevant concept is the typical set from information theory: a relatively small subset of sequences accounts for almost all of the probability mass, so an observation far outside it is surprising under the assumed distribution.

Emre
  • 2,638
  • I confess to being confused by the reference to typical sets in this context. How does that explain the distinction between 60 sixes and any other 60-element sequence, given that all such sequences have the same probability? – whuber May 16 '12 at 14:01
  • That's if the die is fair; the proposition we are questioning. – Emre May 16 '12 at 17:06
  • Which means the null hypothesis is that of fairness: one makes all calculations of test statistics (such as frequency of observing a putative typical set) under that distributional assumption. – whuber May 16 '12 at 17:12
  • Yes, fair enough. – Emre May 16 '12 at 17:15
2

I will try to give an alternative explanation, coming more from the Bayesian side of things. It is true that every sequence is equally likely, including the one with just sixes.

I think the frequentist framework has some problems here. There are many possible ways to test the null hypothesis of independent throws of a fair die: the maximal run length of each possible number from 1-6, the average of the throws, the distribution of ones, twos, ... up to sixes, or a complete comparison of the observed frequencies with the theoretical frequencies via a chi-square test.

But none of these tests detects every possible deviation from independent throws of a fair die, and testing for many deviations at once runs into the multiple testing problem.

On the other hand, in a Bayesian framework you could argue that a series of all sixes is much more likely under the assumption that the die is loaded or the player is cheating. Given a prior probability for "something is fishy" and a probability of "all sixes" given "something is fishy", you could update your probability for "something is fishy" based on the result via $$ P(\text{"fishy"}\mid\text{"all sixes"}) = \frac{P(\text{"all sixes"}\mid\text{"fishy"})P(\text{"fishy"})}{P(\text{"all sixes"})}. $$ Note that, by the law of total probability, $$ P(\text{"all sixes"}) = P(\text{"all sixes"}\mid\text{"fishy"})P(\text{"fishy"}) + P(\text{"all sixes"}\mid\text{"all ok with the die"})P(\text{"all ok with the die"}). $$

This means that, for a fixed prior belief for "fishy" not equal to zero, $P(\text{"fishy"}\mid\text{"all sixes"})$ is strictly increasing in $P(\text{"all sixes"}\mid\text{"fishy"})$. While it is in practice impossible to put an exact number on the latter, it is enough to give a lower estimate on it to get a lower estimate for the probability of "fishy". Obviously this is a bit subjective, but so is the judgement that all sixes is a particularly strange result.

Erik
  • 7,249