I want to compare two models of the form:
model of interest: y ~ measure_of_interest1 + measure_of_interest2 + confound1 + confound2
control model: y ~ confound1 + confound2
I will collect a test data set for the model comparison.
I am thinking of fitting both models on the training set and then computing $R^2$ on the test data. But I'm not certain how to compare the two models. Should I compute the difference in $R^2$ between the two models and then compare it against 1000 permutations (shuffling the y values)? Is that a sensible method?
I'd be very grateful for any pointers.
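To make the proposed procedure concrete, here is a minimal sketch of one way to implement it (using scikit-learn and synthetic data; the data layout, column assignments, and the choice to shuffle the training labels are all assumptions for illustration, not a settled design):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-in data: columns 0-1 play the role of the measures of
# interest, columns 2-3 the confounds (hypothetical layout).
n_train, n_test = 200, 100
X = rng.normal(size=(n_train + n_test, 4))
y = 0.5 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(size=n_train + n_test)

X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

def delta_r2(y_tr):
    """Fit both models on training data; return the test-set R^2 difference."""
    full = LinearRegression().fit(X_train, y_tr)          # interest + confounds
    ctrl = LinearRegression().fit(X_train[:, 2:], y_tr)   # confounds only
    return (r2_score(y_test, full.predict(X_test))
            - r2_score(y_test, ctrl.predict(X_test[:, 2:])))

observed = delta_r2(y_train)

# Permutation null: shuffle the training y, refit both models, and recompute
# the statistic. (Shuffling the training labels is one possible null; other
# schemes, e.g. permuting residuals, exist and differ in what they test.)
perm_stats = np.empty(1000)
for i in range(1000):
    perm_stats[i] = delta_r2(rng.permutation(y_train))

p_value = (1 + np.sum(perm_stats >= observed)) / (1 + len(perm_stats))
print(f"observed delta R^2 = {observed:.3f}, permutation p = {p_value:.3f}")
```

The `(1 + ...)/(1 + ...)` form is the standard small-sample correction so the permutation p-value is never exactly zero.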
I detail a few methods here to help you decide, as objectively as possible, which model is best to keep: https://stats.stackexchange.com/questions/571180/does-my-predictor-in-my-multiple-regression-have-too-many-variables/571186?noredirect=1#comment1055632_571186
– Andy Jul 05 '22 at 15:07

Let's say I was using mean squared error instead; how would I compare the two models? Compute the difference and compare it against a permutation distribution?
– JacquieS Jul 05 '22 at 15:23

Then the only things I need to decide on are:
- is comparing $R^2$ against a permutation distribution appropriate?
- what is the difference between using the difference in $R^2$ versus, say, the difference in RMSE between the two models (again compared against a permutation distribution for significance)?
– JacquieS Jul 06 '22 at 08:36
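On the $R^2$ vs. MSE question in the comment above: for a fixed test set, $R^2 = 1 - \mathrm{SSE}/\mathrm{SS_{tot}}$ with the same $\mathrm{SS_{tot}}$ for both models, so the difference in $R^2$ is exactly a positive rescaling of the difference in MSE, and a permutation test on either statistic ranks permutations identically (RMSE, being a square root, is not an affine function of MSE, so it can differ slightly). A short check of this identity, using hypothetical predictions:

```python
import numpy as np

rng = np.random.default_rng(1)
y_test = rng.normal(size=100)
# Hypothetical predictions: "full" model more accurate than "control".
pred_full = y_test + rng.normal(scale=0.5, size=100)
pred_ctrl = y_test + rng.normal(scale=1.0, size=100)

n = len(y_test)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)

def r2(pred):
    return 1 - np.sum((y_test - pred) ** 2) / ss_tot

def mse(pred):
    return np.mean((y_test - pred) ** 2)

d_r2 = r2(pred_full) - r2(pred_ctrl)
d_mse = mse(pred_ctrl) - mse(pred_full)

# On a fixed test set, delta R^2 = n * delta MSE / SS_tot exactly:
assert np.isclose(d_r2, n * d_mse / ss_tot)
```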