I have an outcome variable that is right-skewed, so I log-transformed it. I fit a null (intercept-only) model on the log-transformed outcome, but when I exponentiate the intercept estimate, it does not equal the mean of the original variable.
Concerned there was an issue with my data, I made a sample data set and found the same discrepancy. Why is this? What does the intercept represent in this model?
Here is the sample data and R code:
library(tidyverse)
test <- tibble(salary = c(10000, 23244, 2222222, 2353, 2353463, 5464564),
               perf = c(4, 2, 4, 2, 5, 7))
Here's my null model:
summary(lm(log(salary) ~ 1 , data = test))
The intercept is 11.971, and when I exponentiate it with exp(11.971), I get 158102.7:
exp(11.971)
But the mean is 1679308:
mean(test$salary)
And, as a sanity check, when I don't log-transform the outcome, the intercept does equal the mean:
summary(lm(salary ~ 1 , data = test))
I'd appreciate knowing 1) how to interpret the intercept, 2) why it doesn't equal the mean, and 3) how I could get predictions on the original (non-log) scale from this model.
exp(log(mean(x))) is equal to mean(x); exp(mean(log(x))) is not. – Frans Rodenburg Mar 03 '21 at 07:51

forecast package for a wider class of transformations (Box-Cox transformation) of the dependent variable. See here: https://otexts.com/fpp2/transformations.html#mathematical-transformations – Dayne Mar 04 '21 at 04:42
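The identity in the comments can be checked directly: the intercept of the null model on log(salary) is the mean of the logs, so exponentiating it recovers the *geometric* mean, not the arithmetic mean. A minimal sketch with the question's data (using a base-R data frame instead of a tibble to avoid the tidyverse dependency):

```r
# Same data as in the question, as a base-R data frame
test <- data.frame(salary = c(10000, 23244, 2222222, 2353, 2353463, 5464564))

# 1) The intercept of the null model on log(salary) is the mean of the logs
m  <- lm(log(salary) ~ 1, data = test)
b0 <- coef(m)[[1]]
all.equal(b0, mean(log(test$salary)))   # TRUE

# 2) Exponentiating the mean of the logs gives the geometric mean ...
exp(b0)
prod(test$salary)^(1 / nrow(test))      # geometric mean: the same number

# ... which is not the arithmetic mean
mean(test$salary)                       # 1679308

# 3) exp(predict(...)) back-transforms to geometric-mean predictions;
#    if you assume lognormal errors, one common correction for the mean
#    on the original scale is exp(fitted + sigma^2 / 2)
exp(fitted(m)[1] + summary(m)$sigma^2 / 2)
```

Note that the lognormal correction in step 3 is only one option and can behave poorly in samples this small; it is shown here as an assumed-lognormal sketch, not as the definitive back-transformation.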