3

I have a small sample size ($n = 23$) and I ran some t-tests to see if there were any group differences (group one had $11$ participants and group two had $12$ participants). I can wrap my head around small samples not being sufficiently powered to detect statistically significant results (and there were none at $0.05$), but I did get some large effect sizes with Cohen's $d$ and moderate effect sizes with SE Cohen's $d$ across some of the comparisons. I don't know how to make sense of those. I believe effect sizes are independent of sample size, but I can't articulate for myself if I can "trust" some of the moderate to large effect sizes I saw with a small sample and no statistically significant $p$ values. I would greatly appreciate it if someone could help me understand this all.

Louis
  • 31
  • 1
  • This is worth a read. I particularly like the accepted answer and think the graphs in my self-answer are useful visuals. – Dave Feb 21 '24 at 16:35
  • "I did get some large effect sizes with Cohen's $$ and moderate effect sizes with SE Cohen's $$ across some of the comparisons" What do you mean by this phrase, and what is the abbreviation 'SE' meaning here? – Sextus Empiricus Feb 23 '24 at 15:33

4 Answers

13

When your sample size is small, there will be more sampling error attached to your effect size estimates and hence more uncertainty (wider confidence intervals). In that sense, I wouldn't say the effect size estimates are "independent" of sample size: there will be greater variability from sample to sample in small samples. A good thing to do is to compute a confidence interval around your Cohen's $d$ sample estimate so you can see how much uncertainty is involved given your sample size.
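For instance, a minimal sketch in R using the effectsize package (the group vectors below are hypothetical stand-ins for your data; cohens_d() reports a confidence interval alongside the point estimate):

#### Sketch: Cohen's d with a 95% CI (hypothetical data) ####
library(effectsize)

group1 <- c(12, 15, 9, 14, 11, 13, 10, 16, 12, 15, 13)    # n = 11
group2 <- c(10, 12, 8, 11, 9, 13, 10, 12, 9, 11, 10, 12)  # n = 12

cohens_d(group1, group2, ci = 0.95)
# With groups this small, expect a wide interval that may span zero
# even when the point estimate looks moderate to large.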

  • 1
    We were typing simultaneously. Luckily, we said more or less the same thing! – Peter Flom Feb 21 '24 at 16:08
  • 2
    Yes, I'm glad there appears to be substantial convergent validity of our responses ;-) – Christian Geiser Feb 21 '24 at 16:10
  • 2
    I think it's important to highlight that the "greater variability" of estimates in smaller samples should lead us to be more skeptical of any results from smaller samples. One nice thing about Bayesian modeling is that it allows us to encode that skepticism directly in our modeling procedure (see the sketch after these comments). – shadowtalker Feb 21 '24 at 16:33
  • I would agree in principle @shadowtalker, but the one thing I would note is that because priors dominate in smaller samples, the potential for greater variability can actually be worryingly worse if the priors are poorly specified. – Shawn Hemelstrand Feb 22 '24 at 12:55
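Following up on the two comments above, here is a minimal sketch of that shrinkage idea, using a conjugate normal approximation rather than a full Bayesian model; the observed $d$, the approximate standard-error formula, and the skeptical prior SD are all hypothetical choices:

#### Sketch: A Skeptical Prior Shrinking a Small-Sample Estimate ####
d_obs <- 0.80                       # hypothetical observed Cohen's d
n1 <- 11; n2 <- 12                  # the question's group sizes
se_d <- sqrt((n1 + n2) / (n1 * n2) + d_obs^2 / (2 * (n1 + n2))) # approx. SE of d

tau <- 0.35                         # prior SD: "most true effects are modest"
post_mean <- (d_obs / se_d^2) / (1 / se_d^2 + 1 / tau^2) # precision-weighted, prior mean 0
post_sd <- sqrt(1 / (1 / se_d^2 + 1 / tau^2))

c(posterior_mean = post_mean, posterior_sd = post_sd)
# Because se_d is large at this n, the prior pulls the estimate
# well below the observed d of 0.80.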
9

As long as the sample is random, the estimate of the effect size is essentially unbiased even in small samples (Cohen's $d$ does carry a slight small-sample bias, which Hedges' $g$ corrects). However, other things being equal, it will be less precise. It's still your best guess, but the guess isn't very good.
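As a rough illustration (simulated, hypothetical data; hedges_g() from the effectsize package is Cohen's $d$ with the small-sample bias correction applied):

#### Sketch: Cohen's d vs. Hedges' g in a Small Sample ####
library(effectsize)

set.seed(1)
x <- rnorm(11, mean = 50, sd = 10) # group sizes matching the question
y <- rnorm(12, mean = 50, sd = 10)

cohens_d(x, y) # slightly biased away from zero at small n
hedges_g(x, y) # Hedges' correction shrinks the estimate slightly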

Peter Flom
  • 119,535
  • 36
  • 175
  • 383
5

All things being equal, the major difference is the variation in the effect estimate as a function of sample size. A very simple simulation in R shows that this is the case.

Here I simulate two groups that have the same mean and standard deviation on a given DV. Theoretically, their Cohen's $d$ should be zero on average (because the difference in their population means is zero), but random sampling fluctuation means the estimates will vary. To test how much they vary based on sample size, this code generates $B = 1,000$ simulated Cohen's $d$ values from sample sizes ranging between $n = 3$ and $n = 200$.

#### Load Library ####
library(effectsize)

#### Parameters ####
boot <- 1000
min.n <- 3
max.n <- 200
mean <- 50
sd <- 35

d <- rep(NA, boot)
sample.size <- rep(NA, boot)

#### For Loop ####
for (i in 1:boot) {

  #### Sample Size ####
  n <- round(runif(1, min = min.n, max = max.n)) # round to a whole-number n

  #### Sim Group Differences ####
  grp1 <- rnorm(n = n, mean = mean, sd = sd)
  grp2 <- rnorm(n = n, mean = mean, sd = sd)

  #### Store Data ####
  d[i] <- cohens_d(grp1, grp2)$Cohens_d
  sample.size[i] <- n

}

#### Plot ####
plot(d ~ sample.size,
     main = "Cohen's d by Sample Size",
     xlab = "Sample Size",
     ylab = "Cohen's d",
     pch = 21,
     bg = "gray")

#### Add Mean ####
abline(h = mean(d), lwd = 3, col = "black")

You can see from the plot that the Cohen's $d$ values hover around the pre-specified mean difference between the groups (the black line correctly shows that they average to zero), but their variation is substantial at low $n$, which means large apparent effect sizes are more likely in small samples even when the true effect is zero. You can see, for example, that one of the tiny samples produced a Cohen's $d$ greater than 2, which would be quite a substantial effect if it were real!

[Plot: simulated Cohen's $d$ values against sample size, with the mean line at zero; the scatter is widest at small $n$ and narrows as $n$ increases.]

  • 1
    Very useful demonstration, Shawn! – Christian Geiser Feb 22 '24 at 16:15
  • 2
    This is a useful demonstration. To be clear to the OP, this is simulating two groups where the real difference in the parameter of interest is 0. This is showing that just due to sampling variation, you could get spurious estimates in small samples. – Weiwen Ng Feb 22 '24 at 18:48
  • I've edited the answer to make those points more clear (I wrote this in a hurry last night so I wasn't as clear as I probably should have been). – Shawn Hemelstrand Feb 23 '24 at 00:16
4

The question is fully addressed by meta-analysis (see, e.g., Introduction to Meta-Analysis by Borenstein et al.), which includes the study of how effect sizes vary from one study to another.

Indeed, the effect size is subject to variation from one study to another; i.e., it is itself a random variable. Thus, a large effect size measured in one study does not mean that the effect will be as large when measured in a bigger or more representative sample... or indeed that the effect even has the same sign.

Furthermore, there may be unaccounted-for parameters that vary from one study to another, resulting in different effect sizes (in which case we can talk about study-specific and average effects). This is dealt with, e.g., by adopting a random-effects model or using meta-regression.
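As a minimal sketch of a random-effects model, here is one common approach using the metafor package (the effect sizes and sampling variances below are entirely made up):

#### Sketch: Random-Effects Meta-Analysis ####
library(metafor)

d <- c(0.80, 0.15, -0.10, 0.45, 0.30) # hypothetical Cohen's d from five studies
v <- c(0.18, 0.20, 0.17, 0.22, 0.19)  # hypothetical sampling variances

res <- rma(yi = d, vi = v, method = "REML") # random-effects model
summary(res) # pooled estimate, CI, and between-study heterogeneity (tau^2)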

Roger V.
  • 3,903