0

It is a pretty general question: a counterexample should also help.

Nick Cox

2 Answers

4

No, especially when the goal is inference. Box-Cox transforms will usually change the data in a way which makes interpretation of the coefficients of your model very difficult.

Aside from inference, I suppose it depends on the method. Linear Discriminant Analysis, for example, assumes the covariates are multivariate normal within each class, so a transformation might help in those instances.
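
To make the interpretation concern concrete, here is a minimal sketch of the Box-Cox power family, assuming the usual one-parameter form `(x**lam - 1) / lam` with the logarithm at `lam = 0`. The function name `box_cox` and the value `lam = 0.5` are illustrative choices, not something estimated from data. It shows that a one-unit change in the original variable maps to different changes on the transformed scale depending on where you start, so a coefficient on the transformed variable no longer reads as "per unit of x".

```python
import math

def box_cox(x, lam):
    """Box-Cox power transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limiting case of the power family as lam -> 0
    return (x ** lam - 1.0) / lam

lam = 0.5  # illustrative value only, not fitted to any data

# The same one-unit step in x, taken at two different starting points.
step_low = box_cox(2.0, lam) - box_cox(1.0, lam)
step_high = box_cox(11.0, lam) - box_cox(10.0, lam)

# The steps differ on the transformed scale, so the slope of a model fitted
# to box_cox(x) is not a constant effect per unit of the original x.
print(step_low, step_high)
```

Whether that is a loss or a gain depends on whether the relationship was linear in the original units to begin with, which is the point debated in the comments.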

Nick Cox
  • 3
    When a Box-Cox transformation is developed according to a principled exploratory data analysis, it's often the case that it improves the interpretation of coefficients. In regression, for instance, the objective is to make the relationship with the response linear, which is about the simplest (and therefore the most easily interpretable) possible relationship. See https://stats.stackexchange.com/questions/4831, https://stats.stackexchange.com/questions/298, and https://stats.stackexchange.com/questions/35711 for more detailed discussions. – whuber Dec 21 '18 at 16:22
  • @whuber Don't you lose the "a unit increase in the predictor leads to a beta increase in the outcome" interpretation of coefficients then? If I transform weight through some Box-Cox transform (which is not the logarithm), how am I to interpret the resulting coefficient? – Demetri Pananos Dec 21 '18 at 16:31
  • 1
    My point is that you gain that interpretation, which was earlier incorrect. If indeed the relationship between response and regressor in the original units of measurement is nonlinear, then it is wrong to assert "a unit increase ... leads to a beta increase," because (by the very definition of non-linear) that is false no matter what value beta might have. Physics and chemistry offer standard, nice examples as illustrated in my answer in the third linked thread I provided. – whuber Dec 21 '18 at 16:47
0

The simple answer is "No".

My own longer answer is that

1) As @gung pointed out in a comment, predictor variables don't have to be normal. In fact, they don't have to be continuous. Linear regression makes assumptions about the error, not the variables.

2) Even if the assumptions are violated, I would say that it is rarely a good idea to transform variables based solely on statistical grounds. Instead, you should use a method that does not make the assumptions that were violated.
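
Point 1 can be checked with a small simulation. The following is a rough sketch using only the Python standard library; the helper names `skewness` and `ols_fit`, the exponential predictor, the true coefficients (1 and 2), and the seed are all assumptions made for illustration. The predictor is strongly skewed, yet because the errors are normal, ordinary least squares recovers the slope and the residuals look roughly symmetric.

```python
import random
import statistics

def skewness(values):
    """Sample skewness (third standardized moment, population form)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# A clearly non-normal (right-skewed) predictor.
x = [rng.expovariate(1.0) for _ in range(n)]
# Linear relationship with normal errors: the assumption is on the error term.
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

intercept, slope = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print("predictor skewness:", skewness(x))
print("residual skewness: ", skewness(residuals))
print("estimated intercept and slope:", intercept, slope)
```

Transforming `x` toward normality here would not fix anything, since nothing is broken; it would only turn a linear relationship into a nonlinear one.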

Peter Flom