My question is similar to "Testing difference in kurtosis between two samples", where a comment suggested:
"Unless you are looking for an enormous difference in kurtosis, it's unlikely any physically realizable sample size will produce significant results."
Either I am doing something wrong, or I have an enormous difference.
I have data from an experiment (N = 20,000) where participants completed a behavioral task. There are two approximately equal-sized groups. Group A is a control group, and group B was exposed to something that is hypothesized to make the really good performers better, the really bad performers worse, and have no effect on average performers. While the field is in agreement with the hypothesis, nobody has really defined what counts as a really good or really bad performer. We often talk about it as someone who is in the top or bottom 10%. While we would love to have longitudinal data, we only have a single data point for each subject (i.e., after group B was exposed).
To me this sounds like I want to test for a difference in kurtosis (i.e., are the tails in group B heavier than the tails in group A?). So far I have resampled the data (this is how I approach most stats problems). I drew 20,000 subjects at random, with replacement, and split them into their groups. I then calculated the kurtosis of each group's distribution (I also calculated the mean, standard deviation, and skewness, along with the 5th, 25th, 50th, 75th, and 95th percentiles) and took the difference between the groups. I repeated this 10,000 times. The mean difference in kurtosis is 3.5, with a standard deviation of 0.8. None of the 10,000 differences is less than zero. This makes me think that there may be a statistically reliable difference between the groups.
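For concreteness, here is a minimal sketch of the resampling procedure described above. It is not my actual analysis code: the data are synthetic stand-ins (a normal sample for group A and a heavier-tailed t sample for group B), and the names `excess_kurtosis`, `bootstrap_kurtosis_diff`, `scores`, and `is_group_b` are made up for illustration. The sample size and number of replicates are also reduced so that it runs quickly.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    m4 = np.mean(d ** 4)
    return m4 / m2 ** 2 - 3.0


def bootstrap_kurtosis_diff(scores, is_group_b, n_boot=1000, seed=0):
    """Resample subjects with replacement, split them by group, and
    return kurtosis(B) - kurtosis(A) for each bootstrap replicate."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    is_group_b = np.asarray(is_group_b, dtype=bool)
    n = scores.size
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # subject indices, with replacement
        s, g = scores[idx], is_group_b[idx]
        diffs[i] = excess_kurtosis(s[g]) - excess_kurtosis(s[~g])
    return diffs


# Synthetic stand-in data (an assumption, not the real experiment).
rng = np.random.default_rng(42)
n = 4000
is_group_b = np.arange(n) % 2 == 1
scores = np.where(is_group_b,
                  rng.standard_t(df=5, size=n),  # heavier tails for "B"
                  rng.normal(size=n))            # normal tails for "A"

diffs = bootstrap_kurtosis_diff(scores, is_group_b, n_boot=500)
lo, hi = np.percentile(diffs, [2.5, 97.5])  # percentile interval
print(f"mean diff = {diffs.mean():.2f}, sd = {diffs.std(ddof=1):.2f}")
print(f"95% percentile interval = ({lo:.2f}, {hi:.2f})")
print(f"fraction of replicates below zero = {np.mean(diffs < 0):.4f}")
```

The last line corresponds to my observation that none of the resampled differences fell below zero.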
What is a better way to test if there is a difference between the two groups?
QQ plot: Lower scores indicate better performance. The task is nominally bounded on both ends, but essentially no subjects perform at floor (-7), while 5% of subjects perform at ceiling (-30). Since some subjects in the unexposed group (group A) are already at ceiling, I am not sure we will be able to see any improvement in group B due to the exposure.
