
I am analysing cross-sectional data from two time points, i.e. before and after an intervention, and I am particularly interested in the causal effect of the intervention. The outcome of interest ($Y$) is metric, and I have some control variables (all nominal or ordinal) such as gender, size of company, etc. To estimate the effect of the intervention, I included a pre/post dummy in a regression model: $Y = a + b1*PrePost + b2*male + b3*small \ enterprise + b4*large \ enterprise$

I need some help with the interpretation: $b1$ is the effect of the intervention when all control variables are held constant (in both groups, pre and post), right? Is that true even when one category of the dummies is left out?

How do I interpret the coefficients on the dummies in relation to $b1$ (the coefficient of most interest, the intervention effect)?

How do I calculate the effect of the intervention ($b1$) for various subgroups defined by the dummies? Do I have to include the interaction of each dummy category with the PrePost dummy, in addition to all group dummies? E.g. $Y = a + b1*prepost + b2*male + b3*small \ enterprise + b4*male*prepost + b5*small \ enterprise*prepost$

How are these coefficients then interpreted?

  • Welcome user124000. I have slightly edited your post to make it more readable, you should check that I have not introduced any error. Note that your username, identicon, & a link to your user page are automatically added to every post you make, so there is no need to sign your posts. In addition, there's no need to say "thank you" at the end of your post - it might seem rude at first, but it's part of the philosophy of this site ([tour]) to "Ask questions, get answers, no distractions". – Antoine Vernet Jul 21 '16 at 08:01

1 Answer


In your first model, $\beta_1$ reflects the estimated effect of the intervention at all levels of all your control variables. So regardless of whether you're male or female, or the size of your enterprise, the estimated effect of the intervention is $\beta_1$.
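As a brief sketch of why this holds, assume the first model is written as $Y=\beta_0+\beta_1*prepost+\beta_2*M+\beta_3*SE+\beta_4*LE$ (the same model as in the question, with $\beta$ in place of $a$ and $b$, and $M$, $SE$, $LE$ as shorthand for the male, small-enterprise and large-enterprise dummies). Holding the controls fixed at any values and differencing the post and pre predictions gives:

$$
\begin{aligned}
E[Y \mid prepost=1, M, SE, LE] - E[Y \mid prepost=0, M, SE, LE]
&= (\beta_0+\beta_1+\beta_2 M+\beta_3 SE+\beta_4 LE) - (\beta_0+\beta_2 M+\beta_3 SE+\beta_4 LE) \\
&= \beta_1
\end{aligned}
$$

The control terms cancel whatever their values are, so the omitted reference category affects the intercept and the control coefficients but not this difference.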

If you want to allow every subgroup to have its own effect, then you would fit an interaction. If you want to do this for every subgroup, then you'll need to include an interaction term containing all of the variables. This means you need two-way and three-way interactions:

$Y=\beta_0+\beta_1*prepost+\beta_2*M+\beta_3*SE+\beta_4*LE+\beta_5*M*prepost+\beta_6*SE*prepost+\beta_7*LE*prepost+\beta_8*M*SE+\beta_9*M*LE+\beta_{10}*M*SE*prepost+\beta_{11}*M*LE*prepost$

How you interpret the coefficients will depend on how you choose to code the variables. This, in turn, will depend on your research question, but very generally the interaction coefficients tell you whether the effect of the intervention varies depending on the subgroup a person is in.
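As a rough illustration with 0/1 dummy coding, the intervention effect for a subgroup in the interaction model is the sum of every coefficient that multiplies $prepost$: $\beta_1+\beta_5 M+\beta_6 SE+\beta_7 LE+\beta_{10} M*SE+\beta_{11} M*LE$. The sketch below just evaluates that sum; the coefficient values and the function name `intervention_effect` are hypothetical, chosen only to make the arithmetic visible, and are not estimates from any data.

```python
# Hypothetical coefficient values for illustration only (not real estimates).
# Keys are the beta indices of the terms that involve prepost.
beta = {1: 2.0, 5: 0.5, 6: -1.0, 7: 1.5, 10: 0.25, 11: -0.75}


def intervention_effect(male, small, large, b=beta):
    """Post-minus-pre difference in predicted Y for one subgroup.

    All arguments are 0/1 dummies; medium enterprises and females are the
    reference categories (all dummies equal to 0).
    """
    return (
        b[1]
        + b[5] * male
        + b[6] * small
        + b[7] * large
        + b[10] * male * small
        + b[11] * male * large
    )


# Effect for each of the six subgroups (sex x enterprise size).
for male in (0, 1):
    for size, (small, large) in {"small": (1, 0), "medium": (0, 0), "large": (0, 1)}.items():
        sex = "male" if male else "female"
        print(sex, size, intervention_effect(male, small, large))
```

With this coding, $\beta_1$ alone is the effect only for the reference subgroup (females in medium enterprises); every other subgroup's effect adds the relevant interaction coefficients.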

Ian_Fin
  • To verify that I got it right: in the first model, b1 is the effect of the intervention if the two groups (pre vs. post) were completely similar in the characteristics that are controlled for by the included covariates, right? How do I interpret the dummy coefficients? E.g. does b2 (male) then tell me the additional effect for males compared to females? So the effect on males would be b1+b2? – user124000 Jul 21 '16 at 11:04
  • b1 is the effect of the intervention IF we assume that the effect was the same for males and females (and for every size of enterprise). The control coefficient b2 tells you the size of the difference between males and females, regardless of whether or not they've had the intervention. a+b1+b2 estimates what Y would be for males after the intervention in the reference enterprise-size category, assuming that males and females (and small and large enterprises) don't vary in the effect of the intervention. Remember that betas are estimates of the effect, rather than the true effect, and estimates are based on assumptions which may be wrong. – Ian_Fin Jul 21 '16 at 11:40
  • So if I assume differing effects, I would use the following model: – user124000 Jul 21 '16 at 13:47
  • Y = a + b1*post + b2*male + b3*SE + b4*ME + b5*post*male + b6*post*SE + b7*post*ME, where "post" is a dummy (0 = pre-intervention, 1 = post-intervention), "male" is a dummy (0 = female, 1 = male), "SE" is a dummy for small enterprises (1 = small, 0 = medium or large), and "ME" is a dummy for medium enterprises (1 = medium, 0 = small or large). So large enterprises are the reference. Here the coefficients would mean: – user124000 Jul 21 '16 at 13:59
  • b1 = the effect of the intervention (difference of means, pre vs. post) for the combination of reference categories, i.e. here for females in large enterprises. b2 = difference between males and females in Y before the intervention. b3 = difference between small and large enterprises in Y before the intervention. b5 = the difference in effect between males and females. b6 = the difference in effect between small and large enterprises, and so on. How do I calculate e.g. the effect for males, regardless of other characteristics? – user124000 Jul 21 '16 at 14:09
  • I assume you also realise that b2 is male vs. female before the intervention in a large enterprise. Similarly, b3 is the difference between small and large before the intervention for females, b6 is the difference in effect between small and large for females, etc. – Ian_Fin Jul 21 '16 at 14:16
  • The effect "regardless of other characteristics" is known as the main effect. Sum coding all of your predictors (e.g., setting their levels to -1 and 1) will tell you about main effects, and also change the interpretation of all your other coefficients. What you probably ought to do is read about coding, and its effects on interpretation, and then decide what coding scheme is most appropriate to your research question. – Ian_Fin Jul 21 '16 at 14:18
  • What would a model look like, with the same coding (0, 1), that shows me the different effects for males and females while controlling for the size of enterprise? – user124000 Jul 21 '16 at 14:59
  • It is controlling for the size of the enterprise. It's telling you whether there's a different effect for males and females in a large enterprise. That is controlling the size. If you mean that you want to know whether there's an effect of gender across all sizes of enterprise then you can't do that with this coding, and you need to use sum coding, as I recommended. – Ian_Fin Jul 21 '16 at 15:09
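The contrast between the two coding schemes discussed in these comments can be sketched on a deliberately simplified design: only post and male, with enterprise size left out, and with the interaction included so that the model reproduces the four cell means exactly. The cell means below are hypothetical numbers chosen for illustration, and the variable names are assumptions, not anything from the thread.

```python
# Hypothetical cell means of Y, keyed by (post, male). Illustration only.
m = {(0, 0): 10.0, (0, 1): 12.0, (1, 0): 13.0, (1, 1): 18.0}

# Dummy (0/1) coding: Y = a + b_post*post + b_male*male + b_int*post*male
a = m[(0, 0)]
b_post = m[(1, 0)] - m[(0, 0)]   # intervention effect for females (reference)
b_male = m[(0, 1)] - m[(0, 0)]   # male vs. female before the intervention
b_int = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]  # difference in effect

effect_female = b_post
effect_male = b_post + b_int

# Sum (-1/+1) coding: Y = c0 + c_post*p + c_male*g + c_int*p*g
c_post = (m[(1, 1)] + m[(1, 0)] - m[(0, 1)] - m[(0, 0)]) / 4
c_int = (m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]) / 4

# Moving p from -1 to +1 is a change of 2 units, hence the factor of 2.
main_effect_post = 2 * c_post              # effect averaged over both sexes
effect_male_sum = 2 * (c_post + c_int)     # same male effect, recovered

print("dummy coding: female effect", effect_female, "male effect", effect_male)
print("sum coding: average effect", main_effect_post)
```

In this sketch the post coefficient under dummy coding is the effect in the reference group only, while under sum coding (after doubling) it is the unweighted average of the group-specific effects. Both codings describe the same fitted cell means; only the meaning of the individual coefficients changes.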