4

Consider this example:

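days <- c(7, 4, 10)                          # number of trips (days) per team
team <- rep(c("A", "B", "C"), times = days)
trip <- rep(NA, length(team))
for (i in seq_along(unique(team))) {
  # number the trips 1..days[i] within each team
  trip[which(team == unique(team)[i])] <- 1:days[i]
}
# same mean (100) for every team, but a different standard deviation
obs  <- c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))
data <- data.frame(team, trip, obs)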

boxplot(obs~team, data)

It is pretty clear that the variance in each team is different, but the mean is similar.

How can I infer this statistically? How can I compare intra-group (within-group) variance with the inter-group (between-group) variance?

IosuP
  • I have no idea what you're asking or trying to do, but intra-group and inter-group variance are related as follows.

    Var(Y)= E[Var(Y|X)]+Var(E[Y|X])

    Interpretation: Let Y be player skill and X be dummies for teams. This says that the variance in player skill is the average within-team variance of skill plus the variance across teams of average within-team skill. In your case, if the team means are similar, the second term is small.

    – CloseToC May 23 '14 at 13:48
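
The decomposition in the comment above can be checked numerically. This is a minimal sketch in R, assuming data simulated as in the question; the vector days = c(7, 4, 10) and the helper pvar (a divide-by-n variance) are names introduced here for illustration. With divide-by-n variances and group weights n_g / n, the identity holds exactly for the sample.

```r
set.seed(1)
days <- c(7, 4, 10)                            # assumed group sizes
team <- rep(c("A", "B", "C"), times = days)
obs  <- c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))

# hypothetical helper: population variance (divide by n, not n - 1)
pvar <- function(x) mean((x - mean(x))^2)

w          <- as.numeric(table(team)) / length(team)  # P(X = team)
within_var <- tapply(obs, team, pvar)                 # Var(Y | X) per team
group_mean <- tapply(obs, team, mean)                 # E[Y | X] per team

within  <- sum(w * within_var)                            # E[Var(Y|X)]
between <- sum(w * (group_mean - sum(w * group_mean))^2)  # Var(E[Y|X])
total   <- pvar(obs)                                      # Var(Y)

print(c(total = total, within = within, between = between))
print(all.equal(total, within + between))
```

When the team means are similar, as in the question, the between term should be small relative to the within term.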

2 Answers

3

I'm not sure if I completely follow your thinking, but I can update this if it isn't what you're looking for.

In general we compare the intra-group variance with the inter-group variance with the ANOVA. (By the way, the more usual terms are within-group variance and between-group variance.) The standard use of the ANOVA may not be what you are after, though. It is used, ultimately, to determine if the means are the same (that's the inter-group variance part). In addition, it assumes that the group variances (intra-) are the same in order for the check of the inter-group variance to be valid.
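
To make the between/within comparison concrete, here is a rough sketch of a one-way ANOVA in R on data simulated like the question's. The group sizes in days and the data frame name dat are assumptions made for this example. The F statistic is simply the between-group mean square divided by the within-group (residual) mean square.

```r
set.seed(42)
days <- c(7, 4, 10)                            # assumed group sizes
dat <- data.frame(
  team = factor(rep(c("A", "B", "C"), times = days)),
  obs  = c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))
)

# one-way ANOVA table: row 1 is the team (between) term, row 2 the residual (within) term
tab <- anova(lm(obs ~ team, data = dat))
print(tab)

ms_between <- tab[["Mean Sq"]][1]
ms_within  <- tab[["Mean Sq"]][2]
f_ratio    <- ms_between / ms_within           # same value as the table's F statistic
print(f_ratio)
```

Note that the p-value from this table relies on the equal-variance assumption, which the simulated data deliberately violate.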

If you want to know if the intra-group variances are the same, you can use Levene's test, which is an ANOVA on the absolute differences of each point from its group mean. In R, the function is leveneTest(obs, team, center=mean) in the car package (documentation). For a slightly more robust version, you can take the absolute value of the differences from the group median and run an ANOVA on them instead. In that case, it is called the Brown-Forsythe test. That is actually the default for leveneTest (ironically), so you can just drop the center=mean part. I discuss these tests here: Why use Levene's test of equality of variance rather than F ratio?
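
Since both tests are just an ANOVA on absolute deviations, they can be sketched by hand in base R. The following is an illustration under assumptions, not a replacement for leveneTest: the data are simulated as in the question, and days, dat, levene, and bf are names chosen for this example.

```r
set.seed(42)
days <- c(7, 4, 10)                            # assumed group sizes
dat <- data.frame(
  team = factor(rep(c("A", "B", "C"), times = days)),
  obs  = c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))
)

# absolute deviation of each point from its own group's center
dat$dev_mean   <- abs(dat$obs - ave(dat$obs, dat$team, FUN = mean))
dat$dev_median <- abs(dat$obs - ave(dat$obs, dat$team, FUN = median))

# Levene's test: one-way ANOVA on deviations from the group mean
levene <- anova(lm(dev_mean ~ team, data = dat))
# Brown-Forsythe test: one-way ANOVA on deviations from the group median
bf     <- anova(lm(dev_median ~ team, data = dat))

print(levene)
print(bf)
# If the car package is installed, car::leveneTest(obs ~ team, data = dat)
# is expected to match the Brown-Forsythe table (center = median is its default).
```

A small p-value in either table is evidence that the within-group spreads differ.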

If you believe you have substantial heteroscedasticity, but want to test if your means differ as well, there are a number of methods available. I discuss them here: Alternatives to one-way ANOVA for heteroscedastic data.
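
One widely used option for comparing means under unequal variances is Welch's correction, available in base R as oneway.test with var.equal = FALSE. A minimal sketch follows, again on data simulated like the question's; days and dat are assumed names for this example.

```r
set.seed(42)
days <- c(7, 4, 10)                            # assumed group sizes
dat <- data.frame(
  team = factor(rep(c("A", "B", "C"), times = days)),
  obs  = c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))
)

# Welch's one-way test: does not assume equal group variances
welch   <- oneway.test(obs ~ team, data = dat, var.equal = FALSE)
# classical one-way ANOVA F test, for comparison
classic <- oneway.test(obs ~ team, data = dat, var.equal = TRUE)

print(welch)
print(classic)
```

The Welch version adjusts the denominator degrees of freedom, so the two results can differ noticeably when the group variances and sizes are as unbalanced as they are here.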

  • Thank you very much. I forgot about the basic assumptions of ANOVA. And thanks also for your last link! – IosuP May 23 '14 at 15:16
  • You're welcome, @user46061. – gung - Reinstate Monica May 23 '14 at 15:21
  • @gung-ReinstateMonica You said ANOVA is to compare intra-group variance with the inter-group variance, but then said that the ultimate use is to compare means. What if we just want to test if two variances are statistically different? Is Levene's test sufficient to test that? Or can Levene be used first to test that there is a difference, and then some other test can be used to answer "how much different are the variances?" – wxz Feb 28 '24 at 19:28
  • As an example, if I had NBA players shoot 100 times from the free throw line once a day and calculated their daily free throw %, I'd expect each player to have a low level of variance between days (i.e. each player consistently shoots around the same percentage every day). But if I plotted all players' daily shot percentages as individual data points, the overall variance in this dataset would be high because some NBA players shoot better than others. How do I analyze that kind of difference? – wxz Feb 28 '24 at 19:35
  • @wxz, you should really ask that as a new question, not have it buried here in comments. The ANOVA uses variances, but does so in order to test for a difference in means between groups. You can use an F-test to compare two variances, but it isn't a good way to do it. Levene's test would be the standard recommendation. I'm not sure your example fits that situation, though. If that situation is the one you are really interested in, it's not a good fit, because there are several nuances that need to be unpacked. You should really ask a new question; you can link back to this, if you want. – gung - Reinstate Monica Feb 28 '24 at 20:38
  • @gung-ReinstateMonica Thanks for the help. I created a new question for this. I also changed the example hoping to make the question clearer. – wxz Feb 28 '24 at 21:08
1

I'm not sure if this is what you're asking, but Levene's Test and a few similar tests in that family test if your groups have the same variance (test for homoscedasticity). So if Levene's test is significant, you reject the hypothesis that the variance of the populations you're sampling from is the same.
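
As a rough illustration of other homogeneity-of-variance tests, base R ships bartlett.test and fligner.test. Treating these as the "similar tests" meant above is an assumption on my part, and days and dat are names made up for this sketch, with data simulated as in the question.

```r
set.seed(42)
days <- c(7, 4, 10)                            # assumed group sizes
dat <- data.frame(
  team = factor(rep(c("A", "B", "C"), times = days)),
  obs  = c(rnorm(days[1], 100, 30), rnorm(days[2], 100, 5), rnorm(days[3], 100, 15))
)

# Bartlett's test: sensitive to departures from normality
bt <- bartlett.test(obs ~ team, data = dat)
# Fligner-Killeen test: rank-based, more robust to non-normal data
fk <- fligner.test(obs ~ team, data = dat)

print(bt)
print(fk)
```

In both cases the null hypothesis is that all groups share the same variance, so a small p-value leads you to reject homoscedasticity.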

jona