How to compare the performance of probability-estimating models

Question

Suppose I have a model that estimates a discrete probability distribution over a set of classes/factors c1, c2, c3.

Given a test set of feature vectors and their labels, what are the options for measuring performance, and when is each one appropriate?

A few options that come to mind:

Maximum likelihood: measure the probability of observing the class labels in the test set under the given model;
Treat each label as a degenerate probability distribution: for example, a test vector with a class label of c1 corresponds to $(1, 0, 0)$. Then use a similarity measure (or a loss function) to compare these degenerate distributions with the predicted distributions;
Use a derived measure of expected utility $\mathbb{E}(U)$ based on the application. Suppose further that we are modeling a game where the classes are the possible outcomes of playing and there is a utility $U(c_i)$ associated with each class. Assume we would play whenever $\mathbb{E}(U) > 0$ under the model, and that our pay-off would be the utility of the actual class label, or $0$ if we chose not to play. Then use the total pay-off over the test set as a performance measure.
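To make the three options concrete, here is a minimal Python sketch of each one on a toy test set. Everything in it is an assumption made for illustration: the function names, the toy predictions, the labels (encoded as indices 0, 1, 2 for c1, c2, c3), and the utilities are all hypothetical. For the second option I picked the squared-error (Brier) score as one possible loss; other choices would work too. Note that if cross-entropy is used as the loss against the degenerate $(1, 0, 0)$-style targets, the second option reduces to the negative log-likelihood of the first.

```python
import math

def mean_log_likelihood(probs, labels):
    """Option 1: average log-probability assigned to the observed labels.
    Assumes every observed label gets a strictly positive probability;
    math.log raises ValueError for a zero probability."""
    return sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def brier_score(probs, labels):
    """Option 2 (one possible loss): mean squared distance between each
    predicted distribution and the one-hot vector of the observed label."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2
                     for k, p_k in enumerate(p))
    return total / len(labels)

def realized_payoff(probs, labels, utility):
    """Option 3: play only when the model's expected utility is positive,
    and collect the utility of the actual label when we do play."""
    total = 0.0
    for p, y in zip(probs, labels):
        expected = sum(p_k * u_k for p_k, u_k in zip(p, utility))
        if expected > 0:
            total += utility[y]
    return total

# Hypothetical toy data: three test vectors, three classes.
probs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
labels = [0, 1, 0]            # observed classes as indices
utility = [2.0, -1.0, -1.0]   # assumed U(c1), U(c2), U(c3)

print(mean_log_likelihood(probs, labels))  # higher is better
print(brier_score(probs, labels))          # lower is better
print(realized_payoff(probs, labels, utility))
```

The first two functions score the predicted probabilities themselves, while the third only scores the play/don't-play decisions that the probabilities lead to, so two models with quite different probabilities can earn the same pay-off.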

Is this question perhaps the same as the one at http://stats.stackexchange.com/questions/2275/how-can-i-determine-accuracy-of-past-probability-calculations? — whuber, Aug 06 '13 at 18:14

0 Answers