
I am working in R. I am doing some simulations of sample sizes by drawing random samples from two normal distributions.

I am calculating Cohen's $d$ effect size with:

library(esc)

esc_mean_sd(grp1m = ..., grp1sd = ..., grp1n = ..., # mean, sd, n of group CONTROL
            grp2m = ..., grp2sd = ..., grp2n = ..., # mean, sd, n of group TEST
            es.type = "d")

Note: according to the package documentation (https://cran.r-project.org/web/packages/esc/esc.pdf), $d$ is the standardized mean difference effect size. In the function's source code it is computed as (grp1m - grp2m)/sd_pooled.
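For reference, here is a minimal base-R sketch of that formula. The helper name `cohens_d` is hypothetical, and the pooled SD is assumed to be the usual degrees-of-freedom-weighted version; that this matches the `sd_pooled` used inside `esc_mean_sd` is an assumption.

```r
# Hypothetical helper, not part of esc: Cohen's d from summary statistics.
# Assumption: sd_pooled is the usual df-weighted pooled standard deviation.
cohens_d <- function(grp1m, grp1sd, grp1n, grp2m, grp2sd, grp2n) {
  sd_pooled <- sqrt(((grp1n - 1) * grp1sd^2 + (grp2n - 1) * grp2sd^2) /
                      (grp1n + grp2n - 2))
  (grp1m - grp2m) / sd_pooled
}

# Same numbers as Case 1 of the reproducible example
d_case1 <- cohens_d(grp1m = 3, grp1sd = 1, grp1n = 20,
                    grp2m = 1, grp2sd = 1, grp2n = 20)
```

With equal standard deviations of 1 in both groups, the pooled SD is 1 and `d_case1` is simply the difference in means.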

Then I am calculating sample size with:

library(pwr)

pwr.t.test(n=NULL, d=..., sig.level=0.05, power=0.8)

It happens that I get the following error:

Error in uniroot(function(n) eval(p.body) - power, c(2 + 1e-10, 1e+09)) :    f() values at end points not of opposite sign

By trial and error, I found that it throws the error when $d > 5.65348...$ Anything below that is fine. I also noticed that the sample size returned at this effect size (with sig.level=0.05, power=0.8) is exactly $2$ per group.

My current workaround is to cap any larger effect size at $5.6$ and report $2$ as the sample size. Is $2$ the theoretical smallest sample size anyway? Is this a valid fix?

Reproducible example

library(esc)
library(pwr)

Case 1: $d=2$ (does not throw an error)

ESC <- esc_mean_sd(grp1m = 3, grp1sd = 1, grp1n = 20,
                   grp2m = 1, grp2sd = 1, grp2n = 20,
                   es.type = "d")

pwr.t.test(n=NULL, d=ESC$es, sig.level=0.05, power=0.8)

Case 2: $d=6$ (throws an error)

ESC <- esc_mean_sd(grp1m = 7, grp1sd = 1, grp1n = 20,
                   grp2m = 1, grp2sd = 1, grp2n = 20,
                   es.type = "d")

pwr.t.test(n=NULL, d=ESC$es, sig.level=0.05, power=0.8)
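One way to see where the threshold might come from is a rough sketch in base R. It assumes a two-sided, two-sample t-test with equal group sizes, with power computed from the noncentral t distribution. The helper names `power_two_sample` and `n_per_group` are hypothetical and are not part of pwr. If the power at the lower search bound n = 2 already meets the target, the function being root-searched has the same sign at both ends of the interval, which is consistent with the `uniroot` error above.

```r
# Hypothetical helper: power of a two-sided, two-sample t-test, n per group.
# Assumes equal group sizes and equal variances.
power_two_sample <- function(n, d, sig.level = 0.05) {
  df   <- 2 * (n - 1)                                 # degrees of freedom
  ncp  <- d * sqrt(n / 2)                             # noncentrality parameter
  crit <- qt(sig.level / 2, df, lower.tail = FALSE)   # two-sided critical value
  pt(crit, df, ncp = ncp, lower.tail = FALSE) +
    pt(-crit, df, ncp = ncp, lower.tail = TRUE)
}

# Hypothetical wrapper: n per group for a target power. It returns 2 when
# n = 2 already reaches the target, instead of letting uniroot() fail.
n_per_group <- function(d, sig.level = 0.05, power = 0.8) {
  if (power_two_sample(2, d, sig.level) >= power) return(2)
  uniroot(function(n) power_two_sample(n, d, sig.level) - power,
          interval = c(2, 1e9))$root
}

n_case1 <- n_per_group(2)   # ordinary root search
n_case2 <- n_per_group(6)   # power at n = 2 already exceeds 0.8
```

Under these assumptions, this avoids modifying the effect size itself: the check at n = 2 replaces the manual cap at $5.6$. Note that 2 is only the smallest n for which this particular test is defined with the search interval used here, not necessarily a theoretical minimum for every testing procedure.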
  • Because there are a great many different forms of Cohen's d, it is essential that you explain exactly which one this software is computing. In general there are mathematical bounds and they depend on the formula and the data counts. – whuber Mar 31 '20 at 13:00
  • I see. "standardized mean difference effect size d" from https://cran.r-project.org/web/packages/esc/esc.pdf. From source code of the function: (grp1m - grp2m)/sd_pooled. – francoiskroll Mar 31 '20 at 13:27
  • If readers must read source code in order to understand your question, it is unlikely you will get any answers. Please make the essential information plain within your post. – whuber Mar 31 '20 at 13:30
  • Sure. My question could be better formulated anyway. I am not directly asking about the mathematics behind the effect size, more how I should go about calculating sample sizes when $d > 5.6$. – francoiskroll Mar 31 '20 at 13:33
  • That's a very different question! In any conceivable situation (except, perhaps, for a finite population) there will be a sample size associated with any effect size. Indeed, the larger the effect size, the smaller the sample that will be needed to detect it. – whuber Mar 31 '20 at 13:38
  • That makes sense. Maybe I should think about the problem from the other end, is $2$ the theoretical minimum sample size? – francoiskroll Mar 31 '20 at 13:43
  • That's an interesting question. Its answer depends on how you plan to test the data. (That is, in part, why there are so many different formulas for Cohen's d.) For instance, in some cases a sample size of $1$ will work: see https://stats.stackexchange.com/a/1836/919. – whuber Mar 31 '20 at 13:47
