I am testing a tool that tries to select the correct outcome. I want to run a significance test to see whether the tool does better than choosing the outcome at random.

It picks from 4 categories, and for each trial I have the correct category and the one the tool picked.

What test should I use?

Adam
  • 13

1 Answer

Consider that you're interested in whether the proportion of times correct is greater than the proportion you'd reasonably get under random guessing (presumably with equal probability on each outcome).

With 4 outcomes, therefore, the chance of being correct by random guessing would be $\frac14$. Assuming independence of trials, the number of correct guesses under the null hypothesis would be $\text{binomial}(n, \frac14)$, where $n$ is the number of trials (attempts at guessing). This leads to a binomial test (see the example at the link involving testing whether a die rolls too many 6's, which is at heart the same problem as yours with a different number of outcomes).
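As a minimal sketch of that binomial test, the upper-tail probability can be computed directly from the binomial probabilities. The function name `binomial_pvalue_greater` and the two short lists are hypothetical, made up purely for illustration; they are not the asker's data.

```python
from math import comb

def binomial_pvalue_greater(k, n, p=0.25):
    """One-sided p-value P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical data: the correct category and the tool's pick for each trial.
truth  = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"]
picked = ["A", "B", "C", "A", "A", "D", "C", "D", "B", "B", "C", "D"]

n = len(truth)                                    # number of trials
k = sum(t == g for t, g in zip(truth, picked))    # number of correct picks

p_value = binomial_pvalue_greater(k, n, p=0.25)
print(f"{k} correct out of {n}, one-sided p-value = {p_value:.6f}")
```

If SciPy is available, `scipy.stats.binomtest(k, n, p=0.25, alternative="greater")` should give the same one-sided p-value.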

If $n$ is large you could use a normal approximation (leading to the typical one-sample proportions test covered in a non-mathematical introductory stats text), but more generally you can base a test directly off the binomial.
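The normal approximation can be sketched as follows, without a continuity correction. The function name `proportion_z_test_greater` and the counts (120 correct out of 400) are assumptions chosen only to illustrate the calculation.

```python
from math import sqrt, erfc

def proportion_z_test_greater(k, n, p0=0.25):
    """One-sided z-test of H0: p = p0 against H1: p > p0 (normal approximation)."""
    # Standardise the count using its mean and standard deviation under H0.
    z = (k - n * p0) / sqrt(n * p0 * (1 - p0))
    # Upper-tail area of the standard normal, 1 - Phi(z).
    p_value = 0.5 * erfc(z / sqrt(2))
    return z, p_value

# Hypothetical counts: 120 correct picks in 400 trials.
z, p_value = proportion_z_test_greater(k=120, n=400)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```

For small $n$ the exact binomial calculation is preferable, since the approximation can be poor when $n \cdot \frac14$ is small.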

From the way the question is phrased, you presumably want a one-tailed test (the alternative being that the proportion correct is greater than $\frac14$).

[Alternatively, in place of the normal approximation to the binomial, you could perform a chi-squared goodness of fit test (two outcomes, with probabilities 1/4 and 3/4 under the null), but this would prevent doing a one-tailed test.]
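To see why the chi-squared version cannot be one-tailed, here is a short derivation. The symbols $k$ (number of correct picks) and $p$ (null probability, here $\frac14$) are notation introduced for this sketch. With two cells, observed counts $k$ and $n-k$ and expected counts $np$ and $n(1-p)$, the statistic is

$$
\chi^2 = \frac{(k - np)^2}{np} + \frac{\big((n-k) - n(1-p)\big)^2}{n(1-p)}
       = (k - np)^2\left[\frac{1}{np} + \frac{1}{n(1-p)}\right]
       = \frac{(k - np)^2}{np(1-p)} = z^2,
$$

where $z = \dfrac{k - np}{\sqrt{np(1-p)}}$ is the statistic from the normal approximation (without continuity correction). Squaring discards the sign of $k - np$, so the chi-squared statistic cannot tell "more correct than chance" from "fewer correct than chance".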

Glen_b
  • 282,281