
I ran some linear regressions in R using lm(), with an interaction term (categorical × categorical) as the predictor of interest, along with a covariate. To calculate the effect size of the interaction as a Cohen's d, I followed Jake Westfall's method as explained at http://jakewestfall.org/blog/index.php/2015/05/27/follow-up-what-about-uris-2n-rule/.

Specifically, in R, I ran:

RSS  <- c(crossprod(model$residuals))   # residual sum of squares
MSE  <- RSS / length(model$residuals)   # note: this divides by n; sigma(model) divides by df.residual
RMSE <- sqrt(MSE)                       # root mean squared error (residual SD)
[code omitted to store the four cell means]
d <- ((A1 - B1) - (A2 - B2)) / (2 * RMSE)   # Westfall's d for the interaction
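(For context, one possible way to pull the four cell means out of the model — illustrative only, not necessarily what my omitted code did — is via emmeans:)

```r
# Illustrative sketch, not the omitted code: extract the four cell means
library(emmeans)
em <- summary(emmeans(model, ~ MayoFile * TreatmentMDX))
em$emmean  # the four cell means, in the order of the reference grid
```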

But then I determined my data had some heteroscedasticity issues and redid my models with robust standard errors.

Specifically, I used the sandwich and lmtest packages to adjust my t- and p-values for reporting with:

library(sandwich)
library(lmtest)

coeftest(model, vcov. = vcovHC(model, type = 'HC3'))

And I reported the robust SEs with emmeans:

library(emmeans)

modelEM <- emmeans(model, ~ MayoFile * TreatmentMDX,
                   vcov. = sandwich::vcovHC(model, type = 'HC3'))

All fine and dandy, but now I'm stuck with a question I ought to know the answer to: does the effect size change too, now that I'm using robust SEs? My gut says "No, the effect size is the same; what's changed is whether or not it's significant" - but I'm not entirely confident in that conclusion.

Thanks for reading!

  • The effect size is a population property. It doesn't care how you compute your standard errors. The estimated effect size can, of course, vary with how it is estimated. It can also change if you modify your probability model, because that would change its very meaning. – whuber Jul 04 '22 at 15:51
  • Thank you for your reply, but I'm afraid I don't really understand its applicability to this situation. Perhaps I should rephrase: when I am reporting the effect size I observed in my specific sample, will it be different if I use robust standard errors vs. if I just use the RMSE derived from a standard lm? – Jenna Clark Jul 04 '22 at 16:07
  • How did you implement the robust standard errors? Please provide that information by editing the question, as comments are easy to overlook and can be deleted. – EdM Jul 04 '22 at 18:29
  • I have edited to add code for my robust standard errors. Thanks! – Jenna Clark Jul 04 '22 at 18:40

2 Answers


The definition of effect size that you're using is relative to the standard error, so if you estimate the standard error differently (which you do if you switch to robust estimation), your estimated effect size will change accordingly.

As @whuber correctly wrote in a comment, this does not mean that the true underlying effect size has changed, and if your robust standard error is indeed a better estimator of the true standard error here (which it may well be), then chances are you now have a better effect size estimator.

However, when reporting, obviously you'll have to report the estimate, as you can't know what the truth is.

  • Thank you so much for this clear answer! It's bad news for me in practice but I really appreciate the clarity. I don't suppose you know of a way I can extract or calculate the RMSE taking the robust standard errors into account? – Jenna Clark Jul 04 '22 at 17:46
  • @JennaClark Unfortunately I don't have the time to think this through properly, but in principle this should be based on the weighted sum of squared errors with weights given by the diagonal elements of $\Omega^{-1}$, if I'm not mistaken, see vcovHC help page. Unfortunately the help page doesn't seem to say how to access $\Omega$, even though I suspect it must be possible somehow. – Christian Hennig Jul 04 '22 at 21:47
  • Not a problem, I really appreciate the sanity check! – Jenna Clark Jul 04 '22 at 22:00
  • Effect size is relative to the population standard deviation, not to the standard error of the estimate. So I would say that the SE has nothing to do with effect size, though it could affect the computation of a confidence interval for the effect size. – Russ Lenth Jul 10 '22 at 18:19
  • @RussLenth I was making reference to the terminology as given in the question. In any case the standard error estimate is a function of the standard deviation estimate, so they are related. (Note that we're in regression so "population standard deviation" is not quite the right term, you may mean residual sd?) – Christian Hennig Jul 10 '22 at 20:08
  • I take exception. When you refer to standard error, you have to specify "of what?" Typically, effect size is the difference of two means divided by the population SD. The latter may be estimated using the residual SD. But if you are talking about estimating the difference of two means, the SE of that estimate is a function of the covariance matrix of the two estimates. It is not a function of just the population SD except in the special case where the two variances are equal and the covariance is zero. – Russ Lenth Jul 11 '22 at 00:49
  • Moreover, when the means come from populations with different variances, Cohen's d isn't even defined. I am not a fan of effect sizes. It is an elusive thing in any but the simplest contexts, and its main application seems to be to respond to a reviewer's report requesting it, often in one of those complex situations. – Russ Lenth Jul 11 '22 at 00:56
  • @RussLenth I should have written error term sd where I wrote residual sd. Otherwise you are probably right with what you are writing here; my answer was meant to address as straight as possible what was asked, without questioning the terminology (which makes sense to do). – Christian Hennig Jul 11 '22 at 09:33

There are some subtleties here, depending on why robust standard errors were estimated and what is meant by the effect size.

Robust standard errors for regression coefficients do not change estimates of expected mean square in linear regression models. McNeish et al, Psychological Methods 22:114-140 (2017), in discussing cluster-robust standard errors (CR-SEs), say explicitly:

CR-SEs can output model $R^2$ and effect size measures that are identical to what would be obtained through OLS because quantities used in these calculations (sum of squares, expected mean squares) are unaffected by the statistical correction to the standard error estimates and the computational formulas are equivalent to a single-level model.

That's straightforward when the CR-SEs are used to account for correlation structures and there is no heteroscedasticity. You might think about the robust correction to coefficient standard errors as changing the effective number of degrees of freedom while using the same mean-square error. For an effect size like Cohen's d, it's the residual standard deviation that's important, not the standard errors of the regression coefficient estimates. Even if the latter are changed with CR-SEs, the former isn't.
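A quick sketch with simulated data (not the asker's data; illustrative only) shows the point: switching to an HC3 robust covariance changes the coefficient SEs, t-, and p-values, but leaves the residual standard deviation, and hence a Cohen's d computed from it, untouched.

```r
# Simulated example with heteroscedastic errors (illustrative only)
library(sandwich)
library(lmtest)

set.seed(1)
dat <- data.frame(x = rnorm(100), g = gl(2, 50))
dat$y <- as.numeric(dat$g) + dat$x + rnorm(100, sd = 0.5 + abs(dat$x))

fit <- lm(y ~ x * g, data = dat)
sigma(fit)                                        # residual SD (the RMSE entering d)
coeftest(fit, vcov. = vcovHC(fit, type = "HC3"))  # only the SE/t/p columns differ from summary(fit)
sigma(fit)                                        # unchanged by the robust correction
```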

If there is heteroscedasticity (i.e., different within-group standard deviations), as this question posits, then does it make sense to report a single effect size for an interaction based on some single standard deviation estimate? I suppose that you could extend formulas for pooled standard deviation estimates under heteroscedasticity, which derive from formulas for variances of sums. But is that what's most useful to your audience?

It might make more sense to choose a different "effect size" estimate for the interaction term, for example comparing the magnitude of the difference in predictions with the interaction versus what you might predict based on A and B alone, without the interaction. To my mind, at least, the practical significance of how much the interaction improves the model is more informative than comparing the interaction to a measure of residual standard deviation.
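As a hedged sketch of that alternative (`A`, `B`, `covar`, and `dat` are hypothetical stand-ins for the asker's variables), one could compare fitted values from models with and without the interaction:

```r
# Illustrative only: 'A', 'B', 'covar', 'dat' stand in for the asker's variables
fit_full <- lm(y ~ A * B + covar, data = dat)
fit_add  <- lm(y ~ A + B + covar, data = dat)

# How much do predictions move when the interaction is included?
summary(abs(predict(fit_full) - predict(fit_add)))

# Formal comparison of the two nested models
anova(fit_add, fit_full)
```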

EdM
  • Thank you for a lovely, well-thought out, and extremely informative answer. In my case, the main effects are completely uninteresting - the only predictor of interest is the interaction. I didn't specify this in detail, but this is a difference-in-difference model, where the interaction is actually Time by Treatment Condition. So we're really not very interested in either time or treatment condition on their own. – Jenna Clark Jul 06 '22 at 17:28
  • @JennaClark depending on details of your design, you might find the discussion on this page helpful. – EdM Jul 06 '22 at 18:07
  • @JennaClark for more extensive discussion on what might be meant by an "effect size," see this post and this document. The magnitude of the interaction term by itself, with an estimate of its error, might be a better "effect size" than trying to standardize further. – EdM Jul 06 '22 at 18:22
  • Thank you - that is a very interesting point. I think I may attempt to persuade my co-authors that this is a better approach than reporting Cohen's d, given the situation. – Jenna Clark Jul 06 '22 at 19:07