This is perhaps the question that some thought "Grading scale: how to handle multiple choice questions with different number of choices" really was. I'm of the opinion that multiple choice questions should not be used for summative assessment. However, I have no actual evidence to back up that view.
Has there been actual research on this, specifically with regard to mathematics?
I don't know whether multiple choice questions are used differently in mathematics than in other subjects, simply because I am ignorant of how other subjects are examined. I do know that multiple choice questions in mathematics are often not simply "Do you happen to know the answer?" questions, where the route to the answer is irrelevant. Rather, the goal is that from the answer a student chooses one can deduce the route they took, so there is no need to see the details (this assumes that they took a route and did not simply guess).
A sub-issue here is the range of schemes designed to discourage guessing. I'd happily also learn about any research into how effective these are. My opinion is that they do not correct for the failings of multiple choice questions but, again, I have no research-based evidence for this.
Let me conclude by re-emphasising that I am asking about summative assessment and not formative assessment and about actual research. I'm not interested in answers that are purely anecdotal or opinion-based (I'll happily hear those in another venue, though).
In thinking about MattF's comment about correlation, let me try to focus the question more precisely. I have no doubt that multiple choice test scores are correlated with scores on every other type of test. That doesn't speak to their efficacy and fairness though.
Consider the following two types of question:
1. Multiple choice, where the options include "Some of the above" and "All of the above".
2. No partial credit, but credit only given for an answer with reasoning.
Both are "all or nothing". I've posited the extra options for the "multiple choice" variant so that it isn't just a closed list to choose between, and so a student who knows the answer ought to have worked it out. In the first, a student can guess. In the second, they can't.
My question, then, could be phrased as: how much effect does the fact that students can guess an answer have on a multiple choice test's ability to report students' abilities when compared with a "no partial credit" test?
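To make the comparison concrete, here is a minimal sketch in Python under a deliberately crude model that is my own assumption and not a research finding: a student genuinely knows a fraction `known` of the questions and guesses uniformly at random among `options` choices on the rest. The function names are hypothetical. The second function applies one commonly described guessing penalty, deducting 1/(options - 1) of a mark per wrong answer, purely as an illustration of the kind of scheme mentioned above.

```python
from fractions import Fraction


def expected_mc_score(known, options):
    """Expected fraction of marks on a multiple choice test.

    Assumption: the student answers the `known` fraction correctly and
    guesses uniformly at random among `options` choices on the rest.
    """
    known = Fraction(known)
    return known + (1 - known) / options


def expected_penalised_score(known, options):
    """Same model, but each wrong answer costs 1/(options - 1) of a mark."""
    known = Fraction(known)
    guessed = 1 - known
    right = guessed / options                   # guessed and correct
    wrong = guessed * (options - 1) / options   # guessed and incorrect
    return known + right - wrong / (options - 1)


# On a "no partial credit, reasoning required" test, the same model gives
# an expected score of exactly `known`, since guessing earns nothing.
for known in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(known, expected_mc_score(known, 4), expected_penalised_score(known, 4))
```

Under this model the expected multiple choice score is just a linear rescaling of `known`, and the penalty removes the inflation on average. What the sketch does not capture is the extra variability that guessing adds to an individual student's score, nor partial knowledge that lets a student eliminate some options, and those are the effects I would want research to address.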
The question "effective for what" still stands. Even if they do "measure what they measure" effectively, they may be measuring the wrong thing. – JPBurke May 28 '14 at 20:00
But to help me understand the type of research you're looking for: if you were going to research your own question, can you give me an example of what sort of analysis you would apply? For instance, what would you do to see the effect you conjecture may exist, and how would you attribute it to guessing? I ask because I'm not clear on what sort of researchable question this is. That's partly why I tried to address my earlier comments to what is known about math education. – JPBurke May 28 '14 at 21:00