
I have a sample of around 100 items, each annotated as 0, 0.5, or 1 by 5 annotators. I later found that some of the annotations had to be deleted, so for some samples we now have fewer than 5 annotations.

If there were no missing annotations, I'd use Fleiss' kappa to measure inter-annotator agreement. But since some annotations are missing, is there still a way to measure agreement?

1 Answer


You just need to use a formulation of Fleiss' kappa (or another chance-adjusted index of categorical agreement) that allows for missing data. If you want to use Python, see the irrCAC library.
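As a rough sketch of how that could look in Python (the ratings below are made up, and the exact irrCAC calls should be checked against the library's documentation), you pass a subjects-by-raters table with NaN marking the deleted annotations:

```python
import numpy as np
import pandas as pd
from irrCAC.raw import CAC

# Hypothetical ratings: rows are items, columns are annotators,
# NaN marks a missing/deleted annotation.
ratings = pd.DataFrame(
    {
        "rater1": [0, 0.5, 1, 0, np.nan],
        "rater2": [0, 0.5, 1, 0.5, 1],
        "rater3": [0, np.nan, 1, 0.5, 1],
        "rater4": [0.5, 0.5, 1, 0, 1],
        "rater5": [0, 0.5, np.nan, 0, 1],
    }
)

cac = CAC(ratings)    # chance-adjusted agreement coefficients on raw ratings
print(cac.fleiss())   # Fleiss' kappa generalized to handle missing ratings
print(cac.gwet())     # Gwet's AC1/AC2 as an alternative coefficient
```

Each method returns the estimated coefficient along with its standard error and confidence interval, so you can report agreement even though the number of annotators varies across items.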