
The question "Interpretation of log transformed predictor" neatly explains how to interpret a log-transformed predictor in OLS. Does the interpretation change if there are 0s in the data and the transformation becomes log(1 + x) instead?

Some authors (e.g. Fox and Weisberg 2011) recommend adding a start (i.e. a positive constant) if a log transformation is necessary to correct skewness and improve symmetry, but the data contains zeros.

Consider a variation of the Ornstein example in CAR (p. 303):

library(car)  # also loads carData, which provides the Ornstein data set
data(Ornstein)
boxplot(Ornstein$interlocks, horizontal = TRUE)


The data are clearly right-skewed and contain 0s.

summary(powerTransform(1 + Ornstein$interlocks))
## bcPower Transformation to Normality 
## 
##                         Est.Power Std.Err. Wald Lower Bound Wald Upper Bound
## 1 + Ornstein$interlocks    0.1248    0.053           0.0209           0.2287
## 
## Likelihood ratio tests about transformation parameters
##                              LRT df      pval
## LR test, lambda = (0)   5.502335  1 0.0189911
## LR test, lambda = (1) 262.431991  1 0.0000000

The estimated power of about 0.12 is close to 0, so powerTransform() suggests that a log(1 + x) transformation could be useful here.

boxplot(log(1 + Ornstein$interlocks), horizontal = TRUE)


As you can see, symmetry is indeed improved.

Question: If this transformed variable were included in an OLS regression as an independent variable (IV), would the coefficient estimates still have the usual interpretation of log-transformed variables?

landroni

1 Answer


It depends. According to Wooldridge (2012), the percentage-change interpretations are often closely preserved, except for changes beginning at $y = 0$ (where the percentage change is not defined). Strictly speaking, using $\log(1+y)$ and then interpreting the estimates as if the variable were $\log(y)$ is acceptable only if the data on $y$ contain relatively few zeros.
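
One way to see when the usual reading holds is to differentiate the fitted relationship. This is only a sketch under an assumed model: $y$ is the outcome, $x \ge 0$ is the predictor that gets transformed (as in the question), and $\alpha$, $\beta$ are illustrative symbols, not Wooldridge's notation.

$$
E[y \mid x] = \alpha + \beta \log(1 + x)
\quad\Longrightarrow\quad
\frac{\partial E[y \mid x]}{\partial x} = \frac{\beta}{1 + x},
\qquad
\Delta E[y \mid x] \approx \beta \, \frac{x}{1 + x} \cdot \frac{\Delta x}{x}.
$$

Under that assumption, a 1% increase in $x$ is associated with a change in $y$ of roughly $0.01\,\beta\,x/(1+x)$ rather than $0.01\,\beta$. The factor $x/(1+x)$ is about 0.99 at $x = 100$, equals 0.5 at $x = 1$, and tends to 0 as $x$ approaches 0, so the usual reading is a good approximation only for observations well above 1. What holds at every value of $x$ is that $\beta$ describes the effect of a proportional change in $1 + x$.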

Nick Cox
Repmat
  • Do you have a page number for Wooldridge? – dimitriy Jul 05 '15 at 16:27
  • @DimitriyV.Masterov In Wooldridge 2009 it's p.192 (Chapter 6.2 More on functional form). – landroni Jul 05 '15 at 18:00
  • In the 2012 edition (US version), it is at the bottom of page 193 – Repmat Jul 05 '15 at 18:19
  • This answer is not quite correct and might be misleading. What matters isn't whether $y$ includes "relatively few zeros," but the actual values of $y$ relative to $1.$ See https://stats.stackexchange.com/questions/576504 for more accurate answers. – whuber May 25 '22 at 12:20