
Context:
I teach a subject and have prepared a multiple-choice quiz for my students. To get a feel for what counts as an acceptable grade, I decided to compute a baseline score: the score that an agent answering completely at random would get on the same quiz. To do that, I ran a Monte Carlo simulation and generated N scores for that agent.

Now my goal is to find a value (i.e. a score) that gives me enough evidence to say: "this student did not answer at random". For that, I wanted to compute a 95% confidence interval on these values.

Question:
The question is: in this case, does it make more sense to compute a one-sided or a two-sided confidence interval? What does each of them actually tell me? I cannot really tell what the difference between these two options would be for me.

Also, I realized that if I compute the two-sided confidence interval, the lower bound is always 0, which means that the lowest 2.5% of the simulated scores are all 0s. Does this affect the decision?

I feel like I am not really interested in the lower bound, but on the other hand, if I compute a one-sided CI, the upper bound is lower, which doesn't really make sense to me.
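
For concreteness, here is a minimal sketch of the kind of simulation described above. It is an illustration under assumed settings, not the exact code that was run: the quiz size (12 questions, 4 options each, 1 point per correct answer), the number of runs, and the helper names `simulate_random_scores` and `empirical_quantile` are all hypothetical. It computes the three cut-offs under discussion: the 2.5% and 97.5% points for the two-sided interval, and the 95% point for the one-sided one.

```python
import math
import random


def simulate_random_scores(n_questions, n_options, n_runs, seed=0):
    """Simulate n_runs quiz scores for an agent that guesses uniformly at random.

    Assumes one correct option per question, worth 1 point; wrong answers score 0.
    """
    rng = random.Random(seed)
    # Without loss of generality, option 0 is treated as the correct one.
    return [
        sum(rng.randrange(n_options) == 0 for _ in range(n_questions))
        for _ in range(n_runs)
    ]


def empirical_quantile(scores, q):
    """Smallest observed score s such that at least a fraction q of scores are <= s."""
    ordered = sorted(scores)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


# Assumed example quiz: 12 questions, 4 options each.
scores = simulate_random_scores(n_questions=12, n_options=4, n_runs=50_000)

lower_two_sided = empirical_quantile(scores, 0.025)  # left end of the two-sided interval
upper_two_sided = empirical_quantile(scores, 0.975)  # right end of the two-sided interval
upper_one_sided = empirical_quantile(scores, 0.95)   # right end of the one-sided interval

print(lower_two_sided, upper_one_sided, upper_two_sided)
```

Because the one-sided version puts the whole 5% in the upper tail, its upper cut-off can never exceed the two-sided one, which is the behaviour described in the last paragraph.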

rusiano
  • Why are you doing a Monte Carlo simulation at all? If you have a multiple-choice exam, there are analytic solutions to this. Also, I would question that guessing at random has much to do with an "acceptable score". – Peter Flom Mar 10 '24 at 10:41
  • How many questions does the quiz have, how many answers are there per question and how are they scored? – COOLSerdash Mar 10 '24 at 10:43
  • @PeterFlom Because it is a simulation, and that is easier for me since I am more of a programmer than a mathematician. The score of the "random agent" is not the acceptable score; its score would be the baseline, and I would expect students to do better than that. – rusiano Mar 10 '24 at 10:46
  • @COOLSerdash It does not really matter, I want a solution that applies to any quiz I might make. However I am considering a scenario in which there are between 10 and 15 questions with 4-5 options each and only 1 correct answer that gives a positive score, while the incorrect options simply score 0. – rusiano Mar 10 '24 at 10:48
  • You are asking for a quantile of the distribution of random scores: it is not a "confidence interval." The latter is computed from a random sample, which is not an issue here. // Your question is answered in several threads, such as https://stats.stackexchange.com/questions/41247/risk-of-extinction-of-schr%c3%b6dingers-cats, which gives the full distribution of correct scores, even when there are different numbers of possible responses in each question. When the number of responses is constant, that is a Binomial distribution. – whuber Mar 10 '24 at 14:19
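
The analytic route mentioned in the comments can be sketched as follows, assuming every question has the same number of options, exactly one correct answer, and 1 point per correct answer, so that the number of correct guesses is Binomial(n, 1/k). The function names and the choice of a 5% upper-tail cut-off are illustrative assumptions, not something taken from the thread.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def min_score_to_reject_guessing(n_questions, n_options, alpha=0.05):
    """Smallest score c such that P(X >= c) <= alpha under pure guessing.

    X ~ Binomial(n_questions, 1 / n_options). Returns n_questions + 1 if even
    a perfect score is more likely than alpha under guessing.
    """
    p = 1 / n_options
    tail = 0.0
    threshold = n_questions + 1
    # Accumulate the upper tail from the top score downwards.
    for k in range(n_questions, -1, -1):
        tail += binomial_pmf(k, n_questions, p)
        if tail > alpha:
            break
        threshold = k
    return threshold


# Assumed example quizzes within the range mentioned in the comments.
print(min_score_to_reject_guessing(12, 4))  # 12 questions, 4 options each
print(min_score_to_reject_guessing(10, 5))  # 10 questions, 5 options each
```

This is a one-sided criterion: only unusually high scores count as evidence against random guessing, so the full 5% is placed in the upper tail and no lower cut-off is needed.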
