How to compute $\eta^2$ in ANOVA by hand?

Question

This R code outputs the eta squared from an ANOVA:

y     <- c(rnorm(30, 3), rnorm(30, 4), rnorm(30, 5))
x     <- sort(rep(paste("treatment", 1:3), 30))
xy    <- data.frame(x,y)
xyaov <- aov(y ~ x, xy)

library(heplots)
etasq(xyaov)

          Partial eta^2
x             0.4807356
Residuals            NA

How can I write code to calculate eta squared "by hand", i.e. without using the ready-made etasq function?

When you type etasq at the command prompt doesn't R show you the code? — whuber, Dec 06 '13 at 21:32
You can also peruse the documentation here. Can you clarify what you are after? — gung - Reinstate Monica, Dec 06 '13 at 21:33
No, it doesn't show the code. Let's pretend I asked the same question about calculating the mean of y by hand. The answer I would want is sum(y)/length(y) — luciano, Dec 06 '13 at 21:40
Note that eta-squared can also be obtained with summary(lm(y ~ x, xy))$r.squared, so this question is equivalent to asking how to calculate r-squared by hand — luciano, Dec 06 '13 at 22:14

Jeremy Miles · Accepted Answer · 2013-12-06T23:26:59.530

It rather depends on what you mean by "by hand".

There is more than one way to do it. You can use the residuals:

> etasq(xyaov)
          Partial eta^2
x             0.4854899
Residuals            NA
> 1 - var(xyaov$residuals)/var(y)
[1] 0.4854899

(You didn't set a seed, so we don't have exactly the same result).

Almost equivalently, you can use the predicted values:

> var(predict(xyaov)) / var(y)
[1] 0.4854899
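
A short sketch of why the residual-based and prediction-based versions agree, assuming a least-squares fit with an intercept (which is what aov() gives here). In that case the residuals $e$ have mean zero and are uncorrelated with the fitted values $\hat{y}$, so the sample variance of $y$ splits additively. The symbols $\hat{y}$ and $e$ are notation introduced for this sketch:
\begin{align}
\operatorname{var}(y) &= \operatorname{var}(\hat{y}) + \operatorname{var}(e) \\[10pt]
1 - \frac{\operatorname{var}(e)}{\operatorname{var}(y)} &= \frac{\operatorname{var}(\hat{y})}{\operatorname{var}(y)}
\end{align}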

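You can use the sums of squares from the ANOVA table, which have to be extracted with this rather unintuitive indexing (the second column of the summary holds the sums of squares, first for x and then for the residuals):

> summary(xyaov)[[1]][[2]][[1]] / (summary(xyaov)[[1]][[2]][[2]] + summary(xyaov)[[1]][[2]][[1]] )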
[1] 0.4854899

You can call summary.lm and extract the R^2 (in a one-way ANOVA, R-squared and eta squared are the same quantity):

> summary.lm(xyaov)$r.squared
[1] 0.4854899

You can do it with no reference to the aov() function by calculating the mean for each group, then the residual, then eta squared based on that:

xy <- data.frame(x, y)  # data.frame() keeps y numeric; cbind(x, y) would coerce it to character
x.means <- as.data.frame(tapply(y, x, mean))  # one row per group, holding the group mean
x.means$x <- row.names(x.means)
xy <- merge(x.means, xy, by="x")
xy$resid <- xy$y - xy[, 2]  # column 2 holds the group mean after the merge
1 - var(xy$resid) / var(xy$y)
[1] 0.4854899
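
If "by hand" means going straight from the definition, the same number can also be built from explicit sums of squares. What follows is only a minimal sketch in base R: the seed is an arbitrary choice for reproducibility, and the names group.mean, grand.mean, ss.between, ss.within, ss.total and eta.sq are made up for this illustration.

```r
# Simulated data as in the question; the seed is arbitrary
set.seed(1)
y <- c(rnorm(30, 3), rnorm(30, 4), rnorm(30, 5))
x <- rep(paste("treatment", 1:3), each = 30)

# ave() returns, for every observation, the mean of its own group
group.mean <- ave(y, x)
grand.mean <- mean(y)

ss.between <- sum((group.mean - grand.mean)^2)  # group means around the grand mean
ss.within  <- sum((y - group.mean)^2)           # observations around their group mean
ss.total   <- sum((y - grand.mean)^2)           # observations around the grand mean

# Eta squared: share of the total variation that lies between groups
eta.sq <- ss.between / ss.total
eta.sq
```

The between and within pieces add up to the total, so 1 - ss.within / ss.total gives the same value.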

gung - Reinstate Monica · Answer 2 · 2015-11-12T15:21:13.440

Eta-squared ($\eta^2$) is a measure of effect size for ANOVA models that is analogous to $R^2$. That is, it gives the proportion of the variability in $Y$ that can be accounted for by knowledge of $X$. There is a 'regular' $\eta^2$ and a partial $\eta^2$. This distinction only comes into play when you have an ANOVA with multiple factors. Here are the formulas:
\begin{align}
\eta^2_\text{(regular)} &= \frac{SS_\text{between}}{SS_\text{total}} \\[10pt]
\eta^2_\text{partial} &= \frac{SS_\text{factor}}{SS_\text{factor} + SS_\text{error}}
\end{align}
For the latter, only a specific factor is implied, and the sums of squares associated with other factors in the model would not enter into the calculation.
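
As a rough illustration of how the two formulas diverge once a second factor is present, here is a minimal sketch on simulated data. The balanced 3 x 2 design, the factor names a and b, the effect sizes and the seed are all made up for illustration and are not part of the original example.

```r
set.seed(1)

# Hypothetical balanced design: 3 levels of a, 2 levels of b, 15 replicates per cell
d <- expand.grid(rep = 1:15, a = c("a1", "a2", "a3"), b = c("b1", "b2"))
d$y <- rnorm(nrow(d), mean = as.numeric(d$a) + 2 * as.numeric(d$b))

fit <- aov(y ~ a + b, data = d)

# Sums of squares from the ANOVA table, named by their row ("a", "b", "Residuals")
ss <- anova(fit)[, "Sum Sq"]
names(ss) <- rownames(anova(fit))

# Regular eta squared for a: SS_a over the total sum of squares
eta.sq.regular <- unname(ss["a"] / sum(ss))

# Partial eta squared for a: SS_b is left out of the denominator
eta.sq.partial <- unname(ss["a"] / (ss["a"] + ss["Residuals"]))

c(regular = eta.sq.regular, partial = eta.sq.partial)
```

Because the total sum of squares also contains the part attributed to b, the regular value for a comes out smaller than the partial one. With a single factor the two denominators are identical, so the values coincide.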

For your example, which has a single factor, the top (regular) formula would be used:

set.seed(55)
y     <- c(rnorm(30, 3), rnorm(30, 4), rnorm(30, 5))
x     <- sort(rep(paste("treatment", 1:3), 30))
xy    <- data.frame(x,y)
xyaov <- aov(y ~ x, xy)

anova(xyaov)
# Analysis of Variance Table
# 
# Response: y
#           Df Sum Sq Mean Sq F value   Pr(>F)    
# x          2 62.808  31.404  33.622 1.52e-11 ***
# Residuals 87 81.260   0.934                     
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

anova(xyaov)[1,2]/sum(anova(xyaov)[,2])
# [1] 0.435961
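
The same value can be checked by plugging the sums of squares printed in the table above into the regular formula. The inputs are the rounded values from the table, so only the first few digits are reliable:
\begin{align}
\eta^2 &= \frac{SS_\text{between}}{SS_\text{total}} = \frac{62.808}{62.808 + 81.260} = \frac{62.808}{144.068} \approx 0.436
\end{align}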
