Model selection across multiple criteria (qualitative and quantitative)

Question

I have two linear regression models fitted to the same data, but the response variable has been transformed differently in each: one uses the Box-Cox transformation and the other the logit transformation. Because the responses are on different scales, I cannot directly compare the models using AIC or other maximum-likelihood-based criteria.

Instead, I'm trying to select the "best" model using other criteria. I've come up with the 3 criteria below:

Parsimony (simpler models should be preferred over complex models)
Adherence to theoretical assumptions (examining if residuals are iid using residual analysis)
Empirical performance (MAE/MSE using leave-one-out cross-validation)

I am unsure how to rank these 3 criteria, which makes it hard for me to determine which of the two models I should use. In what situations should I rank one criterion higher than the other two, and would you recommend a general ranking?

score 2 · Accepted Answer · edited Jun 11 '20 at 14:32

The best criterion in your case is out-of-sample MAE. Here are the reasons why the out-of-sample MAE (OMAE) is better than the alternative measures you mentioned:

Since you used Box-Cox, your results are optimal only in terms of a symmetric linear loss. Therefore, to evaluate the performance of your models, you need to use the linear loss; see Davydenko and Fildes (2016):

"if we stabilise the variance by log-transformations and then transform back forecasts by exponentiation, we get forecasts optimal only under linear loss. If we use another loss, we must first obtain the density forecast using a statistical model, and then adjust our estimate given our specific loss function (see examples of doing this in Goodwin, 2000). Let’s assume we want to empirically compare two methods and find out which method is better in terms of a symmetric linear loss (since this type of loss is commonly used in modelling). If we have only one time series, it seems natural to use a mean absolute error (MAE)."
If a model is over-fitted, it should in theory produce a higher OMAE value, so the OMAE already penalizes unnecessary complexity.
You can also report residual diagnostics, but if your aim is better accuracy, the results of statistical tests matter less than the OMAE value.
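
As a rough illustration of comparing the two transformations by leave-one-out MAE, here is a minimal Python sketch. Everything in it is an assumption made for the example: the data are invented, there is a single predictor, the Box-Cox parameter is fixed at 0.5 rather than estimated, and the helper names (fit_line, loo_mae, and so on) are hypothetical. The key point is that each held-out prediction is transformed back to the original scale before the absolute error is taken, so both models are scored on the same scale.

```python
import math

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def boxcox(y, lam):
    """Box-Cox transform for a fixed lambda (y must be positive)."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam

def inv_boxcox(z, lam):
    """Inverse of the Box-Cox transform above."""
    return math.exp(z) if lam == 0 else (lam * z + 1) ** (1 / lam)

def logit(y):
    """Logit transform (y must lie strictly between 0 and 1)."""
    return math.log(y / (1 - y))

def inv_logit(z):
    """Inverse of the logit transform."""
    return 1 / (1 + math.exp(-z))

def loo_mae(x, y, forward, inverse):
    """Leave-one-out MAE, measured on the original scale of y."""
    errors = []
    for i in range(len(x)):
        # Hold out observation i and fit on the transformed remainder.
        x_train = x[:i] + x[i + 1:]
        z_train = [forward(v) for j, v in enumerate(y) if j != i]
        intercept, slope = fit_line(x_train, z_train)
        # Back-transform the prediction before computing the error.
        pred = inverse(intercept + slope * x[i])
        errors.append(abs(y[i] - pred))
    return sum(errors) / len(errors)

# Invented data: the response lies in (0, 1) so both transforms apply.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0.12, 0.18, 0.27, 0.35, 0.48, 0.55, 0.69, 0.78]

LAMBDA = 0.5  # assumed value; in practice lambda is estimated from the data

mae_boxcox = loo_mae(x, y, lambda v: boxcox(v, LAMBDA),
                     lambda z: inv_boxcox(z, LAMBDA))
mae_logit = loo_mae(x, y, logit, inv_logit)

print(f"LOO MAE, Box-Cox model: {mae_boxcox:.4f}")
print(f"LOO MAE, logit model:   {mae_logit:.4f}")
```

The model with the lower value would be preferred under this criterion. If the Box-Cox parameter is estimated from the data, it would be more rigorous to re-estimate it inside each fold so that the held-out point does not influence the transformation.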

References:

Davydenko, A., & Fildes, R. (2016). Forecast Error Measures: Critical Review and Practical Recommendations. In Business Forecasting: Practical Problems and Solutions. John Wiley & Sons Inc.

Welcome to the site, @Turbofly. I hope we'll see more like this in the future. — gung - Reinstate Monica, Feb 29 '16 at 02:51
