I have two algorithms that each produce, for every observation, a vector of probabilities over 3 classes:
      | Algo 1          | Algo 2
------+-----------------+----------------
Obs 1 | [0.5, 0.3, 0.2] | [0.3, 0.3, 0.4]
Obs 2 | [0.3, 0.2, 0.5] | [0.1, 0.1, 0.8]
Obs 3 | [0.2, 0.2, 0.6] | [0.2, 0.4, 0.4]
...
I want to compare how similar the two algorithms' outputs are. One simple approach is to convert each probability vector into its winning class (the class with the highest probability) and compute an accuracy-style agreement rate, i.e. # of matches / # of observations.
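To make the winning-class comparison concrete, here is a minimal sketch in Python (an assumption, since I have not fixed a language). The names `winning_class` and `agreement_rate` are hypothetical helpers, and the data are just the three rows shown in the table. Ties are broken in favour of the lowest class index, which is also an assumption.

```python
# Probability vectors for the three observations listed in the table.
algo1 = [[0.5, 0.3, 0.2], [0.3, 0.2, 0.5], [0.2, 0.2, 0.6]]
algo2 = [[0.3, 0.3, 0.4], [0.1, 0.1, 0.8], [0.2, 0.4, 0.4]]


def winning_class(probs):
    """Index of the largest probability; ties go to the lowest index (assumption)."""
    return max(range(len(probs)), key=lambda k: probs[k])


def agreement_rate(p, q):
    """Fraction of observations where both algorithms pick the same winning class."""
    if len(p) != len(q):
        raise ValueError("both algorithms must score the same observations")
    matches = sum(winning_class(a) == winning_class(b) for a, b in zip(p, q))
    return matches / len(p)


print(agreement_rate(algo1, algo2))
```

On these three rows only Obs 2 agrees (both pick the third class), so the rate is 1/3. Note that this throws away how confident each algorithm is: for Obs 3, Algo 2 is tied between the second and third classes, and the result depends entirely on the tie-breaking rule.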
Is there a measure or statistical test that can compare the probability vectors?