When we take the log of the response $y$ in a regression and fit the OLS estimator $\hat\beta = (X^TX)^{-1}X^T\log(y)$, we can interpret the coefficients in terms of percent change in $y$.
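To make the "percent change" reading concrete, here is a minimal sketch using only the Python standard library. The data are hypothetical and noise-free (an assumption for illustration, so the fit is exact), and `ols_simple` is a helper written for this example rather than a library function; it is the one-predictor special case of $(X^TX)^{-1}X^T\log(y)$.

```python
import math


def ols_simple(x, z):
    """Closed-form OLS for z = a + b*x (intercept plus one predictor)."""
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    b = sxz / sxx
    a = z_bar - b * x_bar
    return a, b


# Hypothetical noise-free data: y is multiplied by exp(0.05) per unit of x.
x = [float(i) for i in range(10)]
y = [2.0 * math.exp(0.05 * xi) for xi in x]

# Regress log(y) on x, then map the slope back to the original scale.
a, b = ols_simple(x, [math.log(yi) for yi in y])
pct_change = 100.0 * (math.exp(b) - 1.0)

print(f"slope on the log scale: {b:.4f}")
print(f"percent change in y per unit increase in x: {pct_change:.2f}%")
```

The exact multiplicative effect is $e^{b} - 1$; reading $100\,b$ directly as a percentage is only the small-$b$ approximation.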

However, this involves taking the logarithm of the raw values.

In a GLM framework with a logarithmic link function, we would take the logarithm of the conditional expected value, not of the raw values.

Would this have the same "percent change" interpretation? What would be the pros and cons of modeling $\log\left(\mathbb E[Y\vert X=x]\right)$ instead of $\mathbb E[\log\left(Y\right)\vert X=x]$?
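To pin down what is being compared, here is a sketch of the two models side by side. The coefficient symbols $\beta$, $\gamma$, the unit vector $e_j$, and the variance $\sigma^2$ are notation introduced here for illustration, not something fixed above.

$$
\begin{aligned}
\text{log link:}\quad & \log \mathbb E[Y\vert X=x] = x^T\beta
&&\Longrightarrow\quad \frac{\mathbb E[Y\vert X=x+e_j]}{\mathbb E[Y\vert X=x]} = e^{\beta_j},\\
\text{log transform:}\quad & \mathbb E[\log Y\vert X=x] = x^T\gamma
&&\Longrightarrow\quad \frac{\exp\left(\mathbb E[\log Y\vert X=x+e_j]\right)}{\exp\left(\mathbb E[\log Y\vert X=x]\right)} = e^{\gamma_j}.
\end{aligned}
$$

So both coefficients act multiplicatively, but on different targets: the conditional mean in the first case and the conditional geometric mean in the second. By Jensen's inequality, $\mathbb E[\log Y\vert X=x] \le \log \mathbb E[Y\vert X=x]$, so the two targets generally differ. In the special case where $\log Y \vert X=x \sim N(x^T\gamma, \sigma^2)$ with constant $\sigma^2$, we have $\mathbb E[Y\vert X=x] = \exp\left(x^T\gamma + \sigma^2/2\right)$, so the two differ only by a shift in the intercept.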

I mean to restrict the discussion to Gaussian conditional distributions, though it may get particularly interesting if that restriction is loosened.

Dave
  • Related posts: https://stats.stackexchange.com/questions/525590/interpretation-difference-between-log-link-and-log-transformation, https://stats.stackexchange.com/questions/47840/linear-model-with-log-transformed-response-vs-generalized-linear-model-with-log, https://statmodeling.stat.columbia.edu/2006/04/10/log_transformat/ – COOLSerdash Oct 05 '22 at 14:12
  • The interpretation is not an issue. What matters is that the OLS model supposes lognormal variation in the response whereas the GLM model supposes Gaussian variation in the response. If somehow you could persuade GLM software to use a lognormal response distribution, the fits ought to be identical. – whuber Oct 05 '22 at 14:35
