Two raters assessed the risk of bias in 3 randomised controlled trial (RCT) articles using a standard critical appraisal checklist. The checklist has 10 questions addressing key risks of bias in RCTs, each answered dichotomously (Yes/No). This gives me a 2 (rater 1/rater 2) by 3 (articles 1-3) by 10 (checklist items) array of Yes/No responses.

What would be the best method to assess agreement between the raters? Can I just calculate kappa for each article and then take the mean of the 3 kappas? Or is there something better that can model variation in agreement between raters across articles and items?

Below is some made-up illustrative data (1 = Yes, 2 = No). Thank you.
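For concreteness, here is a minimal Python sketch of the "kappa per article" idea. The `rater1`/`rater2` arrays are hypothetical placeholders in the same 1 = Yes / 2 = No coding (not the original illustrative data), and the kappa calculation uses `sklearn.metrics.cohen_kappa_score`:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical placeholder ratings: rows = articles 1-3, columns = 10 checklist
# items, coded 1 = Yes, 2 = No. Substitute the real responses here.
rater1 = np.array([
    [1, 1, 2, 1, 1, 2, 1, 1, 1, 2],   # article 1
    [1, 2, 2, 1, 1, 1, 2, 1, 1, 1],   # article 2
    [2, 1, 1, 1, 2, 1, 1, 2, 1, 1],   # article 3
])
rater2 = np.array([
    [1, 1, 2, 1, 2, 2, 1, 1, 1, 2],
    [1, 2, 1, 1, 1, 1, 2, 1, 2, 1],
    [2, 1, 1, 1, 2, 1, 1, 1, 1, 1],
])

# Cohen's kappa computed separately for each article (only 10 items each)
per_article = [cohen_kappa_score(rater1[a], rater2[a]) for a in range(3)]
print("per-article kappas:", np.round(per_article, 2))

# Kappa pooled over all 30 item-level judgements
pooled = cohen_kappa_score(rater1.ravel(), rater2.ravel())
print("pooled kappa:", round(pooled, 2))
```

One thing this makes obvious: each per-article kappa rests on only 10 paired judgements, so the three estimates (and their mean) are very noisy, which is part of the motivation for an approach that models articles and items jointly rather than averaging separate kappas.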
You might look into generalizability theory. Huebner, A. & Lucht, M. Generalizability theory in R. Practical Assessment, Research & Evaluation (in press). – Jeffrey Girard Jan 10 '20 at 14:21
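To make the generalizability-theory suggestion concrete, a simplified sketch (my own illustration, not taken from the comment or the cited paper) is to decompose the item-level scores into crossed variance components for article, rater and item with a linear mixed model; the rater-related components quantify how much of the variation is due to disagreement. The data frame below is random placeholder data, so the estimates themselves are meaningless, and a fuller G-study would also include the two-way interaction facets:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Long-format placeholder data: 2 raters x 3 articles x 10 items,
# scores recoded so 1 = Yes, 0 = No. Replace with the real ratings.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    [(r, a, i, int(rng.random() < 0.7))
     for r in (1, 2) for a in (1, 2, 3) for i in range(1, 11)],
    columns=["rater", "article", "item", "score"],
)
df["all"] = 1  # single dummy group so the random effects are fully crossed

# Each facet enters as a variance component; the estimated variances show how
# much score variation is attributable to articles, raters and items.
model = smf.mixedlm(
    "score ~ 1", df, groups="all",
    vc_formula={"article": "0 + C(article)",
                "rater":   "0 + C(rater)",
                "item":    "0 + C(item)"},
)
result = model.fit()
print(result.summary())  # the 'Var' column lists the variance components
```

With a data set this small the fit may sit on the boundary (components estimated at zero) and warn about convergence; the point of the sketch is only to show the structure of a G-study-style decomposition.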
