4

I have a linear model that doesn't have any particular issues with its assumptions (the diagnostic plots look fine). However, it has a slightly skewed response (skewness approx. 0.5) and a few skewed independent variables. I would now like to transform these skewed variables (as well as the response) and see whether the transformed model is better (that is, whether it explains the response better). However, as far as I know I can use neither the R^2 statistic nor the AIC/BIC criteria to compare models fitted to different data, since the responses differ in variance. What should I do to compare the new, transformed model with the old one? Which criterion can I use?

jakes
  • 215
  • You might try "de-transforming" the predicted values for the calculation of fit statistics. For example, if you fit to the log of the independent variable, that is log(IV), you can take the exponent of the model's predicted values as the "de-transform" for the calculation of R-squared etc. – James Phillips Jul 06 '18 at 13:45
  • First find a way to backtransform to the original scale, see https://stats.stackexchange.com/questions/115571/back-transforming-regression-results-when-modeling-logy or https://stackoverflow.com/questions/46392683/how-to-plot-transformed-regression-back-on-original-scale. But that will introduce bias, so some debiasing could help. Maybe see https://www.sciencedirect.com/science/article/abs/pii/S0167629601000868. Then, to compare the backtransformed predictions with the predictions on the original scale, maybe use cross-validation. – kjetil b halvorsen Jul 07 '18 at 17:32
  • @kjetilbhalvorsen I'm trying to lessen the number of unanswered questions. I know you wrote this a long time ago, but it looks like it could be an answer. Do you want to write it? – Peter Flom Dec 05 '23 at 13:11
  • @PeterFlom: Done – kjetil b halvorsen Jan 03 '24 at 20:44

1 Answer

0

For the transformed model, find a way to backtransform. See for instance Which is the best backtransformation correction method for log(outcome) predictions? or Back transforming regression results when modeling log(y). Predictions after backtransformation might be biased, so some debiasing might be in order. The first link above gives one way; you could test or modify it using cross-validation.
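
To make this concrete, here is a minimal sketch (in Python, not part of the original answer) of fitting a model to log(y), backtransforming the predictions with a plain exponential, and applying Duan's smearing estimator as one simple debiasing correction. The data and variable names are purely illustrative.

```python
# Illustrative sketch: backtransforming predictions from a log(y) model
# and correcting the bias with Duan's smearing estimator.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, size=n)
y = np.exp(0.5 + 0.3 * x + rng.normal(scale=0.4, size=n))  # right-skewed response

X = sm.add_constant(x)
fit_log = sm.OLS(np.log(y), X).fit()

# Naive backtransform: exp(fitted log(y)); this targets the median of y
# and is biased low as an estimate of the mean of y.
pred_naive = np.exp(fit_log.fittedvalues)

# Duan's smearing estimator: rescale by the average of exp(residuals).
smearing_factor = np.mean(np.exp(fit_log.resid))
pred_debiased = pred_naive * smearing_factor

print("smearing factor:", smearing_factor)
```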

Then, finally, to compare the two models after backtransformation, you might use cross-validation. As for information criteria like AIC, have a look at Comparing AIC of a model and its log-transformed version.
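
A sketch of such a comparison (again Python, with assumed illustrative data as above): both models are scored by cross-validated RMSE on the original scale of y, with the log-model predictions backtransformed and smeared inside each fold, so the two models are judged on the same quantity.

```python
# Illustrative sketch: cross-validated comparison of a raw-scale and a
# log-scale linear model, scored on the original scale of y.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=200).reshape(-1, 1)
y = np.exp(0.5 + 0.3 * x.ravel() + rng.normal(scale=0.4, size=200))

def cv_rmse(x, y, log_response, n_splits=5, seed=0):
    """Cross-validated RMSE on the original scale of y.

    If log_response is True, the model is fit to log(y) and the predictions
    are backtransformed with a smearing factor computed on the training fold.
    """
    errs = []
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(x):
        target = np.log(y[train]) if log_response else y[train]
        model = LinearRegression().fit(x[train], target)
        pred = model.predict(x[test])
        if log_response:
            resid = np.log(y[train]) - model.predict(x[train])
            pred = np.exp(pred) * np.mean(np.exp(resid))  # smearing backtransform
        errs.append(np.sqrt(np.mean((y[test] - pred) ** 2)))
    return float(np.mean(errs))

print("raw-scale model CV RMSE:", cv_rmse(x, y, log_response=False))
print("log-scale model CV RMSE:", cv_rmse(x, y, log_response=True))
```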