4

I have a linear model of a dependent variable, $y$, with two predictor variables, year and site, and their interaction, with year being numeric and site categorical.

The main effect of year is not significantly different from zero, with an estimated value of 0.02312 for the $y$ on year slope.

Some of the year by site interactions are significant. The summary of the linear model in R gives estimates for the interaction terms as deviations from the main effect of year.

If, for example, the year by site 1 estimate is reported as .416 and is significant, then to compute the $y$ on year slope within site 1, should I compute 0 + .416 or 0.02312 + .416?

chl
  • 53,725
Jdub
  • 569

2 Answers

5

Andrew Gelman's tentative advice on that is based on the significance and the sign of the predictors:

  • If predictor is significant: keep it (if it has the unexpected sign: think hard about it!)
  • If predictor is not significant but in the expected direction: keep it. It will not improve the prediction dramatically, but won't do much harm.
  • If predictor is not significant and in the unexpected direction: set it to zero (p. 69 in "Data Analysis Using Regression and Multilevel/Hierarchical Models", 2007)

This presupposes that you think about the expected directions of the predictors before running the model.

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

Felix S
  • 4,700
  • This is interesting (thank you), but I don't know if I would do that. If you had a positive, but non-significant, slope for a main effect, and a significant positive slope in one group with a significant negative slope in a second group, it seems that you would, following this advice, add the n.s. main effect to the positive within group slope but not to the negative within group slope. This sounds kind of fishy to me. – Jdub May 24 '12 at 23:52
  • If you have cross-over interactions, this might get difficult, indeed. But I think in your case the main question would be: did you expect a positive main effect of 'year'? Then add it to all slopes. – Felix S May 25 '12 at 09:12
1

If you think there is an interaction and no main effect, model it that way and interpret the effect from the coefficient of the interaction term. There is guidance that says interactions should only be looked at if the main effects are significant. Sometimes there is justification for that, but it is not a law of statistics set in stone. It would be nice, though, if you had some subject-matter rationale for the existence of the interaction without a main effect.
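For the arithmetic itself, here is a minimal sketch of how the coefficients combine. It assumes R's default treatment contrasts, so the "main effect" of year is the slope in the reference site and each interaction term is a difference from that slope. The data are hypothetical and noise-free, built only to reproduce the two numbers in the question, and `ols_slope` is a helper written for this illustration. In a model with the full year by site interaction, the fitted slope within each site equals the slope from a separate regression on that site, which is what the sketch relies on.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs (simple regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical, noise-free data (an assumption, not the asker's data).
years = [0, 1, 2, 3, 4, 5]
y_reference = [1.0 + 0.02312 * t for t in years]   # reference site
y_site1 = [2.0 + 0.43912 * t for t in years]       # site 1

# Under treatment coding, the year coefficient is the reference-site slope.
main_effect = ols_slope(years, y_reference)

# The year:site1 coefficient is the difference between the two slopes.
slope_site1 = ols_slope(years, y_site1)
interaction = slope_site1 - main_effect

# The within-site slope is the sum of both estimates, whether or not
# the main effect is significantly different from zero.
print(round(main_effect, 5), round(interaction, 5), round(main_effect + interaction, 5))
```

Under these assumptions the slope within site 1 is the sum of the two estimates, 0.02312 + 0.416, not 0 + 0.416. A non-significant estimate is still the fitted value. Replacing it with zero describes a different model from the one that was fitted.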

  • I don't have any problem justifying the model, I'm just wondering about the technical problem in the last sentence of the question. Should the non-significant main effect be treated as zero or the small non-zero estimate for the sake of reporting the significant within site slopes? It probably doesn't make a practical difference, but which way is correct? – Jdub May 23 '12 at 00:02
  • I would drop the main effect on the basis of parsimony. There is no right answer here. Either you include a nonsignificant term, which violates a parsimony principle, or you drop the main effect, which violates the rule of not including an interaction without a main effect. In or out, the main effect term is neither practically nor theoretically important. – Michael R. Chernick May 23 '12 at 01:04
  • In response to Michael Chernick, is it not possible that we have a non-significant predictor which can predispose or sensitize our patients to a couple of other predictors (significantly)? In such a case, we would need to assess its interaction with other predictors, even though it is itself non-significant. Am I right? – Vic Apr 27 '13 at 16:11