
I have 3 different DVs that I try to model with 3 distinct linear mixed models using the same set of IVs. I found that the DV for which I have the least data also ends up with the fewest IVs in the final model, while the DV with the most data retains the largest number of IVs (including interactions). I tend to believe this might be an artifact of data availability. Might that be the case? And if so, is that a serious problem when reporting these results in a peer-reviewed journal?

I know that many argue against stepwise reduction procedures; however, my feeling is that (at least in my case) the selection mostly arrives at the same final model irrespective of minor changes to the initial model. Do the objections to model selection mainly apply when there are large numbers of IVs, or can even a relatively small number of initial IVs produce completely different results?

Frize

2 Answers


Yes. This is another argument against stepwise selection. Because it relies on p-values, it will find more complex models when there is more data, even if the data are generated from the same underlying process.

This is true of all automatic methods, though (at least, all I can think of).

It's really another argument for using effect sizes, your substantive knowledge and your brain to figure out models.
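To make the sample-size effect concrete, here is a minimal simulation sketch in Python (not part of the original answer; the forward_select helper and the decaying coefficients are illustrative assumptions). Every predictor has a small, nonzero effect, and p-value-based forward selection is run on the same data-generating process at two sample sizes:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def forward_select(X, y, alpha=0.05):
    """Greedy forward selection: at each step add the remaining
    predictor with the smallest p-value, stop when none is < alpha."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        best_j, best_p = None, 1.0
        for j in remaining:
            fit = sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit()
            p = fit.pvalues[-1]  # p-value of the newly added term
            if p < best_p:
                best_j, best_p = j, p
        if best_p >= alpha:
            break
        selected.append(best_j)
        remaining.remove(best_j)
    return sorted(selected)

# One fixed data-generating process: every predictor has a small,
# decaying effect, so no small model is exactly "true".
beta = 0.5 / np.arange(1, 11)

for n in (50, 5000):
    X = rng.normal(size=(n, 10))
    y = X @ beta + rng.normal(size=n)
    print(n, forward_select(X, y))
```

At the larger sample size, more of the small effects clear the significance threshold, so the selected model grows even though the underlying process never changed.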

Peter Flom

  • In my case I used AIC, as I don't trust the p-values that much – Frize Oct 01 '13 at 11:19
  • OK but AIC is based on likelihood and likelihood also changes with N. – Peter Flom Oct 01 '13 at 11:23
  • Using my brain is really the last option I want to go for ;)... No, seriously: is there a way I can make use of a priori knowledge and then test model fit? I tried comparing residual plots, but they all look somewhat similar, almost irrespective of the IVs considered – Frize Oct 01 '13 at 12:13
  • You can compare AIC among models that are reasonable. – Peter Flom Oct 01 '13 at 17:26

If you increase the sample size enough, the most complex model in your candidate set will have the lowest AIC (if it contains the others as special cases). The AIC is intended to pick the model with best predictive accuracy from a candidate set of models that are all wrong—i.e. approximations to the infinite-dimensional "model" that is reality. So in this context "too big" a model is one that over-reaches & ends up adding noise, for a given sample size. For a selection rule that converges on the least-complex true model among your candidate set as sample size increases, use the Bayesian information criterion (BIC)—if you think that the truth is simple enough to be among the models you're considering.
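As a rough illustration (a sketch, not from the original answer; the tiny coefficient beta2 is an assumption standing in for "all candidate models are approximations"), one can track how often AIC and BIC prefer the larger of two nested models as the sample size grows:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
beta2 = 0.05   # tiny but nonzero: the simple model is only an approximation
reps = 300

for n in (200, 2000, 20000):
    complex_wins = {"aic": 0, "bic": 0}
    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = rng.normal(size=n)
        y = x1 + beta2 * x2 + rng.normal(size=n)
        m_small = sm.OLS(y, sm.add_constant(x1)).fit()
        m_big = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
        for ic in complex_wins:
            # statsmodels exposes .aic and .bic on fitted results;
            # lower is better for both criteria
            if getattr(m_big, ic) < getattr(m_small, ic):
                complex_wins[ic] += 1
    print(n, {ic: round(w / reps, 2) for ic, w in complex_wins.items()})
```

AIC starts preferring the complex model once the tiny effect improves the log-likelihood by more than AIC's fixed penalty, while BIC's log(n) penalty holds out longer. Set beta2 = 0 and the pattern changes: BIC's preference for the complex model shrinks toward zero as n grows, while AIC's settles near a fixed, nonzero overfitting rate.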