
I'm working in an unfamiliar Bayesian context here, so apologies if my terminology isn't entirely correct!

Imagine I'm trying to predict the performance of players on a five-a-side football team. The team has seven players, and in any given game only five play. My outcome variable is the goal difference in the match, and the input data is which players played, plus other controls.

I can create a binary dummy variable for each player and use these as predictors in the model - but what I really want to do is take advantage of partial pooling to estimate each player's effect more efficiently.
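To make the dummy-variable version concrete, here is a minimal sketch on simulated data. Everything in it (the number of games, the "true" effects, the noise level, and the variable names) is made up for illustration, and the other controls are left out:

```python
import numpy as np

rng = np.random.default_rng(0)

# Seven players in the squad, five on the pitch per game (as in the question).
# The number of games is an arbitrary choice for this illustration.
n_players, n_on_pitch, n_games = 7, 5, 200

# Hypothetical "true" player effects, used only to simulate data.
true_effect = rng.normal(0.0, 1.0, size=n_players)

# X[g, p] = 1 if player p played in game g, else 0 (exactly five ones per row).
X = np.zeros((n_games, n_players))
for g in range(n_games):
    lineup = rng.choice(n_players, size=n_on_pitch, replace=False)
    X[g, lineup] = 1.0

# Simulated goal difference: sum of the effects of those who played, plus noise.
y = X @ true_effect + rng.normal(0.0, 1.0, size=n_games)

# No-pooling estimate: ordinary least squares on the player dummies.
# There is no separate intercept here, because every row of X sums to 5,
# so a constant column would be perfectly collinear with the dummies.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(beta_hat, 2))
```

Each coefficient here is estimated on its own, with no shrinkage toward a common distribution, which is the part I'd like to improve on.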

I'd know how to do this if there were only one player per match - then I'm just fitting a multilevel model with 'player' as a categorical grouping variable - but I don't know how (or if) I can do this when there are multiple players per game.
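In my own notation (which may well not be standard), the one-player-per-match case I have in mind would look something like the following, where $y_g$ is the goal difference in game $g$, $p[g]$ is the single player in that game, and $\alpha$, $\sigma_\beta$ and $\sigma$ are symbols I've introduced just for this illustration:

```latex
\begin{aligned}
y_g &= \alpha + \beta_{p[g]} + \varepsilon_g, & \varepsilon_g &\sim \mathcal{N}(0, \sigma^2), \\
\beta_p &\sim \mathcal{N}(0, \sigma_\beta^2), & p &= 1, \dots, 7 .
\end{aligned}
```

The shared distribution on the $\beta_p$ is what gives the partial pooling; what I can't see is how to set this up when five players contribute to each $y_g$.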

Does anyone have any thoughts about this, or references where I can read more?

  • Are you trying to estimate how good the players are based on the results of the games where they did vs did not contribute? – George Savva May 17 '23 at 09:39
  • Hi George - yep, exactly. As I said in the post, I know I can do this in a simple model with binary variables but it feels like a multilevel approach would be more efficient and it irks me that I can't find anything on how to do it! – nwrdobrn May 17 '23 at 10:01
