7

I have a panel data model with a double-log functional form. I have 4 variables, one of which is a dummy. What is the best way to transform the 0 values of my dummy so that I can take natural logs (since the log of zero is undefined)? I have found 2 options so far, but I don't know which is better (or perhaps there is a 3rd option).

1) Transform all the values of 0 into a very small value (0.00000000001) and then take logs. I suspect this could still change the outcome of my model quite a bit.

2) Change the values of 0 into 1 and the dummy values of 1 into e (the base of the natural logarithm). The log of this newly defined dummy then takes on the values zero and one, and the interpretation of my betas would remain the same.

Marie
  • But what if you are estimating a translog production function? By definition, all the variables must be in logs, including dummy variables. – Dan Oct 27 '13 at 17:17
  • Welcome to the site, @Dan. This is not an answer to the OP's question. Please only use the "Your Answer" field to provide answers. If you have your own question, click the ASK QUESTION at the top & ask it there; then we can help you properly. Since you are new here, you may want to read our about page, which has information for new users. – gung - Reinstate Monica Oct 27 '13 at 17:38

2 Answers

17

Just because you're taking logs of some of the variables in your model, there's no reason you have to take logs of all of them. Leave a 0/1 coded dummy variable as it is.
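
As a rough illustration of why leaving the dummy alone is harmless, the sketch below fits the same log-log regression three ways on simulated data: with the raw 0/1 dummy, with the "1 and e" recoding from the question, and with the "tiny value" recoding. The data, the true coefficients (1.0, 0.5, 0.3), the helper `ols`, and the use of plain least squares instead of a panel estimator are all assumptions made for the example, not part of the original post.

```python
import numpy as np

# Simulated data (hypothetical): ln(Y) = 1.0 + 0.5*ln(X) + 0.3*D + noise
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
d = rng.integers(0, 2, size=n).astype(float)  # 0/1 dummy
log_y = 1.0 + 0.5 * np.log(x) + 0.3 * d + rng.normal(0.0, 0.1, size=n)


def ols(columns, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# (a) Leave the dummy as it is.
b_raw = ols([np.log(x), d], log_y)

# (b) Option 2 from the question: recode 0 -> 1 and 1 -> e, then take logs.
#     log(1) = 0 and log(e) = 1, so this simply reproduces the original dummy.
d_e = np.where(d == 1, np.e, 1.0)
b_e = ols([np.log(x), np.log(d_e)], log_y)

# (c) Option 1 from the question: recode 0 -> tiny value, then take logs.
#     The regressor now takes the values log(eps) and 0, so the fit is the
#     same but the dummy coefficient is rescaled by -1/log(eps).
eps = 1e-11
d_eps = np.where(d == 1, 1.0, eps)
b_eps = ols([np.log(x), np.log(d_eps)], log_y)

print("raw dummy      :", b_raw)
print("1/e recoding   :", b_e)
print("tiny recoding  :", b_eps)
```

In this sketch, (b) returns the same coefficients as (a), because the recoded-and-logged variable is the original dummy again. In (c) the coefficient on $\ln(X)$ is unchanged, but the dummy coefficient is divided by $-\ln(\varepsilon) \approx 25.3$ and so depends entirely on the arbitrary choice of small value.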

onestop
11

I agree with onestop. You may also find this blog post from Econometrics Beat useful in learning how to interpret the coefficients on dummy variables when the dependent variable is logged:

http://davegiles.blogspot.com/2011/03/dummies-for-dummies.html

The Cliffs Notes version: take a model like \begin{equation} \ln(Y) = a + b \cdot \ln(X) + c \cdot D + \varepsilon, \end{equation} where $X$ is a continuous regressor and $D$ is a zero-one dummy variable. Then, holding $X$ fixed:

If $D$ switches from 0 to 1, the % impact of $D$ on $Y$ is $100 \cdot (\exp(c)-1).$
If $D$ switches from 1 to 0, the % impact of $D$ on $Y$ is $100 \cdot (\exp(-c)-1).$
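
A small sketch of the two formulas, assuming a hypothetical estimated coefficient of $c = 0.3$ (the function name `pct_effect` and the value of `c` are made up for illustration):

```python
import math


def pct_effect(c, switch_on=True):
    """Percentage change in Y when the dummy switches.

    switch_on=True  -> D goes from 0 to 1: 100 * (exp(c) - 1)
    switch_on=False -> D goes from 1 to 0: 100 * (exp(-c) - 1)
    """
    signed_c = c if switch_on else -c
    return 100.0 * (math.exp(signed_c) - 1.0)


c = 0.3  # hypothetical coefficient on the dummy
on = pct_effect(c)                     # roughly +35%
off = pct_effect(c, switch_on=False)   # roughly -26%

print(f"0 -> 1: {on:.2f}%")
print(f"1 -> 0: {off:.2f}%")
```

Note that the two effects are not mirror images of each other: a rise of about 35% followed by a fall of about 26% returns $Y$ to its starting level, which is why the naive reading "$100 \cdot c$ percent" is only an approximation for small $c$.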

And don't read anything into the title of the post.

dimitriy
  • (+1) In a panel where some units may switch from 0 to 1, and then from 1 back to 0, should our interpretation account for this? I wonder how movement in and out may affect the coefficient, if at all. – Thomas Bilach Apr 20 '21 at 18:59
  • I think the coefficient would be the same (the policy dummy), but you would need two different transformations of it, as I have above. If you think treatment takes a while to decay, you can break out the treatment into several regime variables. – dimitriy Apr 20 '21 at 19:04
  • I’m sorry Dimitriy I meant the interpretation. If the dummy keeps switching on and off, is it just $e^c$, which is just giving us the regime where it turns on? Suppose the panel is daily data and subjects receive the same treatment every Tuesday for a month. How would the interpretation work in that setting? If this makes for a new question let me know. – Thomas Bilach Apr 20 '21 at 19:34
  • @ThomasBilach I am not sure I follow, so perhaps a new question with a specification is best. – dimitriy Apr 20 '21 at 19:39
  • Sure. If you have time to look it over, I generated a new question. – Thomas Bilach May 01 '21 at 18:23