
Let's assume I am trying to find out whether individual people are better at beer pong while sober or drunk. In this experiment I do four trials. On Monday, I first have them throw 100 times sober and record their results. Then, I have them drink x beers, throw another 100 times, and record their results. I then repeat the experiment the following Monday. So, I have two datasets (one per Monday) that look like this:

Person   Sober   Drunk   Drunk - Sober
A        25      20      -5
B        52      75      23

where there are ~20,000 people shared between datasets.

With this data, what statistics would I perform to select people who show a significant increase or decrease in skill while drunk? My initial thought is some sort of rank test, but I can't convince myself of a method. I'm a probability newbie and am not aware of many of the methods.

  • So can we say 10k people took part in experiment A and 10k other people participated in experiment B? And you are not interested in average differences in outcome, but you want to identify specific "overachieving" outliers (individual datapoints)? – confused student Jan 07 '22 at 18:47

1 Answer


Regression is usually a good first approach, and I think it addresses everything you asked about. In this case we will fit a generalized linear mixed model to account for the replication (the two Mondays) for each person.
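In symbols (my notation, just spelling out the glmer formula used further down): for person $i$, each block of 100 throws is modeled as

$$\text{hits} \sim \text{Binomial}(100,\ p_i), \qquad \operatorname{logit}(p_i) = \alpha + u_{0i} + (\beta + u_{1i})\,\text{drunk},$$

where $u_{0i}$ is person $i$'s baseline (sober) skill, $u_{1i}$ is how much that person's skill shifts when drunk, and $\text{drunk}$ is a 0/1 indicator. The per-person shifts $\beta + u_{1i}$ are exactly the increases or decreases you want to detect.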

Here is a simulation to make some data (I just used n=100 but you can also do this with n=20000).

library(MASS)
library(tidyverse)
library(lme4)

set.seed(123)

# simulate data
n <- 100

# person effects: correlated random intercept and drunk effect for each person
p <- mvrnorm(n, mu = c(0, 0), Sigma = matrix(c(1, 0.6, 0.6, 1), 2, 2))  # from MASS package

# population effects
alpha <- -3
trt <- -1

# two replicates (Mondays) of 100 sober and 100 drunk throws per person
df <- rbind(
  data.frame(person = 1:n, trt = "sober", rep = 1,
             result = rbinom(n, size = 100, prob = plogis(alpha + p[, 1]))),
  data.frame(person = 1:n, trt = "drunk", rep = 1,
             result = rbinom(n, size = 100, prob = plogis(alpha + p[, 1] + trt + p[, 2]))),
  data.frame(person = 1:n, trt = "sober", rep = 2,
             result = rbinom(n, size = 100, prob = plogis(alpha + p[, 1]))),
  data.frame(person = 1:n, trt = "drunk", rep = 2,
             result = rbinom(n, size = 100, prob = plogis(alpha + p[, 1] + trt + p[, 2])))
)
df$trials <- 100
df$person <- as.factor(df$person)
df$trt    <- factor(df$trt, levels = c("sober", "drunk"), labels = c("sober", "drunk"))

The code above just simulates the data: each person gets a correlated pair of random effects, one for baseline (sober) skill and one for how that skill shifts when drunk. I set the population intercept alpha to -3 and the population treatment effect trt to -1.
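As a quick sanity check on those values (this is just plogis arithmetic, not part of the fitting), the implied population-average hit rates are:

plogis(-3)       # ~0.047: an average person makes about 4.7% of throws sober
plogis(-3 + -1)  # ~0.018: about 1.8% of throws drunk

We then fit the model: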

# model
m <- glmer(cbind(result, trials - result) ~ trt + (1 + trt | person),
           data = df, family = binomial)

Let's take a look at how well it recovered the parameters.
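The printout below is the model summary (presumably produced with):

summary(m)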

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: cbind(result, trials - result) ~ trt + (1 + trt | person)
   Data: df

     AIC      BIC   logLik deviance df.resid 
  1917.3   1937.3   -953.6   1907.3      395 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.8100 -0.6476 -0.1752  0.4740  1.9488 

Random effects:
 Groups Name        Variance Std.Dev. Corr
 person (Intercept) 0.9582   0.9789       
        trtdrunk    0.7849   0.8860   0.61
Number of obs: 400, groups:  person, 100

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -2.9202     0.1049 -27.832  < 2e-16 ***
trtdrunk     -0.9678     0.1186  -8.163 3.27e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
         (Intr)
trtdrunk 0.321

We can see that it estimated the population-level effects well (roughly -2.92 and -0.97 versus the true -3 and -1). Next, let's rank the people by their predicted probability of making a single shot sober and their predicted probability of making a single shot drunk.

#predictions
preddf <- expand.grid(person=levels(df$person), trt=levels(df$trt))
preddf$pred <- predict(m, newdata=preddf, type="response")

# top sober players
preddf %>% filter(trt == "sober") %>% slice_max(pred, n = 10)

# top drunk players
preddf %>% filter(trt == "drunk") %>% slice_max(pred, n = 10)

If you are strictly interested in the people whose skill increases the most while drunk, you can look at

preddf %>%
  pivot_wider(names_from = trt, values_from = pred) %>%
  mutate(diff = drunk - sober) %>%
  slice_max(diff, n = 10)

You could probably arrive at a pretty similar ranking just by looking at the raw data,

df %>%
  pivot_wider(names_from = trt, values_from = result) %>%
  mutate(diff = drunk - sober) %>%
  group_by(person) %>%
  summarize(diff_mean = mean(diff)) %>%
  slice_max(diff_mean, n = 10)

but in your question you also asked about "significant" increases or decreases. Essentially, you want to put some kind of 'confidence' interval on these point estimates. This is a lot easier if we estimate the same model in a Bayesian setting (I won't go into the details of prior choice; we'll go with the brms defaults).

library(brms)

# the 10 people with the largest observed improvement while drunk
top_persons <- df %>%
  pivot_wider(names_from = trt, values_from = result) %>%
  mutate(diff = drunk - sober) %>%
  group_by(person) %>%
  summarize(diff_mean = mean(diff)) %>%
  slice_max(diff_mean, n = 10) %>%
  dplyr::select(person) %>%
  deframe()

# same model as above, fit with brms default priors
m_bayes <- brm(result | trials(trials) ~ trt + (1 + trt | person),
               data = df, family = binomial)

# posterior draws of each person's drunk success probability minus their sober success probability
post_diff <- as.data.frame(
  plogis(fixef(m_bayes, summary = F)[, 1] +
           ranef(m_bayes, summary = F)$person[, , 1] +
           fixef(m_bayes, summary = F)[, 2] +
           ranef(m_bayes, summary = F)$person[, , 2]) -   # person's drunk probability
    plogis(fixef(m_bayes, summary = F)[, 1] +
             ranef(m_bayes, summary = F)$person[, , 1])   # minus their sober probability
)
colnames(post_diff) <- levels(df$person)  # keep plain person IDs as column names
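To flag "significant" changes rather than eyeballing a plot, one option (a sketch; the post_summary name and the 95% cutoffs are my choices, not part of the original workflow) is to summarize each person's posterior difference with a credible interval and a posterior probability of improvement:

post_summary <- post_diff %>%
  pivot_longer(cols = everything(), names_to = "person", values_to = "diff") %>%
  group_by(person) %>%
  summarize(mean_diff  = mean(diff),             # posterior mean of drunk - sober
            lower95    = quantile(diff, 0.025),  # 95% credible interval
            upper95    = quantile(diff, 0.975),
            p_increase = mean(diff > 0))         # posterior probability of being better drunk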

Here is the posterior distribution of the increase in success probability while drunk for the 10 players with the largest observed increases:

library(ggdist)  # for stat_halfeye()

post_diff %>%
  mutate(iter = 1:n()) %>%
  pivot_longer(cols = -iter) %>%
  filter(name %in% top_persons) %>%
  ggplot(aes(y = fct_reorder(name, value), x = value)) +
  stat_halfeye()

[figure: half-eye plots of each top-10 player's posterior drunk - sober success probability]

You can see that for these top 10, the posterior distributions of the increase sit well above zero.
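To answer the "increase or decrease" part of your question directly (using the post_summary table sketched above), you could pull out everyone whose 95% interval excludes zero:

# people who are credibly better drunk
post_summary %>% filter(lower95 > 0)

# people who are credibly worse drunk
post_summary %>% filter(upper95 < 0)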
