
I have read several publications attempting to justify the use of a fixed-effects model with statements along the lines of "the fixed-effects model was chosen because the heterogeneity was low". However, I am concerned it may still be an inappropriate approach to data analysis.

Are there reasons or publications that discuss whether and why this could be a mistake?

cmirian
    Arguably a duplicate with a good answer: https://stats.stackexchange.com/questions/156603 – amoeba Jan 30 '18 at 20:29

3 Answers


Note: If you want a quick answer to your question regarding using the heterogeneity test to make this decision, scroll down to "Which Justifications Are Reasonable?".

There are a few justifications (some more reasonable than others) that researchers offer for their selection of a fixed-effects vs. random-effects meta-analytic synthesis. These are discussed in introductory meta-analysis textbooks, like Borenstein et al. (2009), Card (2011), and Cooper (2017).

Without condemning or condoning any of these justifications (yet), they include:

Justifications for Selection of Fixed-Effects Model

  1. Analytic Simplicity: Some folks feel the calculation/interpretation of a random-effects model is beyond their statistical understanding, and therefore stick to a simpler model. With the fixed-effect model the researcher only needs to estimate variability in effect sizes driven by sampling error. For better or worse, this is a pragmatic practice encouraged explicitly in Card (2011).
  2. Prior Belief in No Study-Level Variability/Moderators: If a researcher believes that all effect sizes in their sample vary only because of sampling error--and that there is no systematic study-level variability (and therefore no moderators)--there would be little imperative to fit a random-effects model. I think this justification and the former sometimes walk hand-in-hand, when a researcher feels fitting a random-effects model is beyond their capacity, and then rationalizes this decision after the fact by claiming that they don't anticipate any true study-level heterogeneity.

  3. Systematic Moderators Have Been Exhaustively Considered: Some researchers may use a fixed-effect analysis after they have investigated and taken into account every moderator that they can think of. The underlying rationale here is that once a researcher has accounted for every conceivable/meaningful source of study-level variability, all that can be left over is sampling error, and therefore a random-effects model would be unnecessary.

  4. Non-Significant Heterogeneity Test (i.e., $Q$ Statistic): A researcher might feel more comfortable adopting a fixed-effects model if they fail to reject the null hypothesis of a homogeneous sample of effect sizes (a computational sketch of this test follows this list).
  5. Intention to Make Limited/Specific Inferences: Fixed-effects models are appropriate for speaking to patterns of effects strictly within the sample of effects. A researcher might therefore justify fitting a fixed-effects model if they are comfortable speaking only to what is going on in their sample, and not speculating about what might happen in studies missed by their review, or in studies that come after their review.
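
For concreteness, here is a minimal Python sketch of the Cochran $Q$ heterogeneity test referred to in Justification 4. The effect sizes and sampling variances are invented purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical observed effect sizes and their sampling variances
y = np.array([0.30, 0.45, 0.12, 0.60, 0.25])
v = np.array([0.04, 0.03, 0.05, 0.02, 0.06])

w = 1.0 / v                           # inverse-variance weights
theta_fe = np.sum(w * y) / np.sum(w)  # fixed-effect pooled estimate
Q = np.sum(w * (y - theta_fe) ** 2)   # Cochran's Q statistic
df = len(y) - 1                       # Q ~ chi-square(k - 1) under homogeneity
p = stats.chi2.sf(Q, df)              # p-value of the heterogeneity test

print(f"Q = {Q:.2f}, df = {df}, p = {p:.3f}")
```

A non-significant $p$ here is what Justification 4 leans on; as discussed under "Which Justifications Are Reasonable?", that reasoning is fragile when the number of studies is especially small or large.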

Justifications for Selection of a Random-Effects Model

  6. Prior Belief in Study-Level Variability/Moderators: In contrast to Justification 2. (in favour of fixed-effects models), if the researcher anticipates that there will be some meaningful amount of study-level variability (and therefore moderation), they would default to specifying a random-effects model. If you come from a psychology background (I do), this is becoming an increasingly routine/encouraged default way of thinking about effect sizes (e.g., see Cumming, 2014).

  7. Significant Heterogeneity Test (i.e., $Q$ Statistic): Just as a researcher might use a non-significant $Q$ test to justify their selection of a fixed-effects model, so too might they use a significant $Q$ test (rejecting the null of homogeneous effect sizes) to justify their use of a random-effects model.

  8. Analytic Pragmatism: It turns out that if you fit a random-effects model and the estimated between-study variance is zero (with the common DerSimonian-Laird estimator, this happens whenever $Q$ does not exceed its degrees of freedom), you will arrive at the fixed-effect estimates; only when the estimated heterogeneity is greater than zero will the estimates differ. Some researchers might therefore default to a random-effects model, figuring that their analyses will "work out" the way they ought to, depending on the qualities of the underlying data (see the sketch after this list).

  9. Intention to Make Broad/Generalizable Inferences: Unlike with fixed-effects models, random-effects models license a researcher to speak (to some degree) beyond their sample, in terms of patterns of effects/moderation that would play out in a broader literature. If this level of inference is desirable to a researcher, they might therefore prefer a random-effects model.
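
Here is a small sketch of the collapsing behaviour described in Justification 8, using the DerSimonian-Laird estimator of the between-study variance $\tau^2$; both toy datasets are hypothetical:

```python
import numpy as np

def pooled_estimates(y, v):
    """Fixed-effect and DerSimonian-Laird random-effects pooled estimates."""
    w = 1.0 / v
    theta_fe = np.sum(w * y) / np.sum(w)        # fixed-effect estimate
    Q = np.sum(w * (y - theta_fe) ** 2)         # Cochran's Q
    df = len(y) - 1
    C = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - df) / C)               # DL estimate of tau^2, truncated at 0
    w_re = 1.0 / (v + tau2)                     # random-effects weights
    theta_re = np.sum(w_re * y) / np.sum(w_re)  # random-effects estimate
    return theta_fe, theta_re, tau2

# Near-identical effects: Q <= df, so tau^2 = 0 and the two estimates coincide
print(pooled_estimates(np.array([0.20, 0.22, 0.19]), np.array([0.04, 0.05, 0.04])))

# Clearly heterogeneous effects (with unequal variances): tau^2 > 0 and they diverge
print(pooled_estimates(np.array([0.05, 0.80, 0.40]), np.array([0.005, 0.04, 0.01])))
```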

Consequences of Specifying the Wrong Model

Though not an explicit part of your question, I think it's worth pointing out why the researcher needs to "get it right" when selecting between fixed-effects and random-effects meta-analysis models: it largely comes down to estimation precision and statistical power.

Fixed-effects models are more statistically powerful at the risk of yielding artificially precise estimates; random-effects models are less statistically powerful, but potentially more reasonable if there is true heterogeneity. In the context of tests of moderators, fixed-effect models can underestimate the extent of error variance, while random-effects models can overestimate the extent of error variance (depending on whether their modelling assumptions are met or violated, see Overton, 1998). Again, within the psychology literature, there is an increasing sense that the field has relied too heavily on fixed-effects meta-analyses, and that therefore we have deluded ourselves into a greater sense of certainty/precision in our effects (see Schmidt et al., 2009).
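
As a quick illustration of the precision point, here is a small simulation sketch (all parameter values are made up): when the true between-study variance $\tau^2$ is greater than zero, fixed-effect 95% confidence intervals cover the true mean effect far less often than their nominal rate, because the fixed-effect standard error ignores $\tau^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
k, mu, tau2 = 10, 0.3, 0.05       # studies per synthesis, true mean effect, true tau^2
v = rng.uniform(0.01, 0.05, k)    # per-study sampling variances (hypothetical)

covered, n_sims = 0, 5000
for _ in range(n_sims):
    theta_i = rng.normal(mu, np.sqrt(tau2), k)  # true study effects vary around mu
    y = rng.normal(theta_i, np.sqrt(v))         # observed effects add sampling error
    w = 1.0 / v
    est = np.sum(w * y) / np.sum(w)             # fixed-effect pooled estimate
    se = np.sqrt(1.0 / np.sum(w))               # its (too small) standard error
    covered += abs(est - mu) < 1.96 * se        # does the 95% CI contain mu?

print(f"FE coverage under heterogeneity: {covered / n_sims:.2%} (nominal 95%)")
```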

Which Justifications Are Reasonable?

To answer your particular inquiry directly: some (e.g., Borenstein et al., 2009; Card, 2011) caution against the use of the heterogeneity test statistic $Q$ for the purpose of deciding whether to specify a fixed-effects or random-effects model (Justification 4. and Justification 7.). These authors argue instead that you ought to make this decision primarily on conceptual grounds (i.e., Justification 2. or Justification 6.). The fallibility of the $Q$ statistic for this purpose also makes intuitive sense for especially small (or especially large) syntheses, where $Q$ is likely to be under-powered to detect meaningful heterogeneity (or over-powered, detecting trivial amounts of heterogeneity).

Analytic simplicity (Justification 1.) seems like another justification for fixed-effects models that is unlikely to be successful (for reasons that I think are more obvious). Arguing that all possible moderators have been exhausted (Justification 3.), on the other hand, could be more compelling in some cases, if the researcher can demonstrate that they have considered/modelled a wide range of moderator variables. If they've only coded a few moderators, this justification will likely be seen as pretty specious/flimsy.

Letting the data make the decision via a default random-effects model (Justification 8.) is one that I feel uncertain about. It's certainly not an active/principled decision, but coupled with the psychology field's shift towards preferring random-effects models as a default, it may prove to be an acceptable (though not a particularly thoughtful) justification.

That leaves justifications related to prior beliefs regarding the distribution(s) of effects (Justification 2. and Justification 6.), and those related to the kinds of inferences the researcher wishes to be licensed to make (Justification 5. and Justification 9.). The plausibility of prior beliefs about distributions of effects will largely come down to features of the research you are synthesizing; as Cooper (2017) notes, if you are synthesizing effects of mechanistic/universal processes, collected from largely similar contexts/samples, and in tightly controlled environments, a fixed-effects analysis could be entirely reasonable. Synthesizing results from replications of the same experiment would be a good example of when this analytic strategy could be desirable (see Goh et al., 2016). If, however, you're synthesizing a field where designs, manipulations, measures, contexts, and sample characteristics differ quite a bit, it seems to become increasingly difficult to argue that one is studying exactly the same effect in each instance. Lastly, the kinds of inferences one wishes to make seem a matter of personal preference/taste, so I'm not sure how one would begin to argue for/against this justification as long as it seemed conceptually defensible.

References

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. West Sussex, UK: Wiley.

Card, N. A. (2011). Applied meta-analysis for social science research. New York, NY: Guilford Press.

Cooper, H. (2017). Research synthesis and meta-analysis: A step-by-step approach. Thousand Oaks, CA: Sage.

Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7-29.

Goh, J. X., Hall, J. A., & Rosenthal, R. (2016). Mini meta-analysis of your own studies: Some arguments on why and a primer on how. Social and Personality Psychology Compass, 10(10), 535-549.

Overton, R. C. (1998). A comparison of fixed-effects and mixed (random-effects) models for meta-analysis tests of moderator variable effects. Psychological Methods, 3(3), 354-379.

Schmidt, F. L., Oh, I. S., & Hayes, T. L. (2009). Fixed- versus random-effects models in meta-analysis: Model properties and an empirical comparison of differences in results. British Journal of Mathematical and Statistical Psychology, 62(1), 97-128.

jsakaluk
    It is hard to deny that studies, especially those of biological systems, possess unmeasured characteristics that create important between-study differences. – Todd D Jan 31 '24 at 05:02

You use a fixed-effects model if you want to make a conditional inference about the average outcome of the $k$ studies included in your analysis. So, any statements you make about the average outcome only pertain to those $k$ studies and you cannot automatically generalize to other studies.

You use a random-effects model if you want to make an unconditional inference about the average outcome in a (typically hypothetical) population of studies from which the $k$ studies included in your analysis are assumed to have come. So, any statements you make about the average outcome in principle pertain to that entire population of studies (assuming that the $k$ studies included in your meta-analysis are a random sample of the studies in the population, or can in some sense be considered representative of them).

A very common misconception is that the fixed-effects model is only appropriate when the true outcomes are homogeneous and that the random-effects model should be used when they are heterogeneous. However, both models are perfectly fine even under heterogeneity -- the crucial distinction is the type of inference you can make (conditional versus unconditional).
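
In the usual notation, with $y_i$ the observed outcome of study $i$ and $v_i$ its (assumed known) sampling variance, the two models can be written as

$$\mbox{FE:} \quad y_i = \theta_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, v_i),$$

where the $\theta_i$ are treated as fixed constants and the estimand is their weighted average, versus

$$\mbox{RE:} \quad y_i = \theta_i + \varepsilon_i, \qquad \theta_i \sim N(\mu, \tau^2), \quad \varepsilon_i \sim N(0, v_i),$$

where the estimand is $\mu$, the average outcome in the population of studies. Note that neither formulation requires the true outcomes $\theta_i$ to be identical.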

In fact, it is also perfectly fine to fit both models: once to make a statement about the average outcome of those $k$ studies, and once to attempt the more difficult task of making a statement about the average effect 'in general'.

Wolfgang
  • Thanks for a clear answer. I believe the power of any meta-analysis will be lower for a random-effects model. On the other hand, the idea is usually to find out what is happening in the population rather than just in those studies. So I presume that a random-effects model needs to be used most of the time. Are there any circumstances when a fixed-effects model is appropriate and a random-effects model is not? – rnso Jun 12 '15 at 10:13
  • "When you want to make a conditional inference about the average outcome of the $k$ studies included in your analysis." Yes, I am repeating my answer, but that's a circumstance when a fixed-effects model is appropriate and a random-effects model is not. – Wolfgang Jun 13 '15 at 12:29
  • Might you be able to cite a good paper or link which gives an example of the two types of inferences at a step-by-step level? – cerd Mar 03 '16 at 13:51

You ask in particular for references.

The classical reference for this is probably the article by Hedges and Vevea (1998) in Psychological Methods, entitled "Fixed- and random-effects models in meta-analysis".

If you work in health, the relevant chapter in the Cochrane Handbook is probably essential reading and contains much good sense. In particular, it suggests when meta-analysis should not be considered at all, and it clearly distinguishes what to do about heterogeneity other than simply fitting random-effects models.

mdewey