I have two nested, non-linear models for the same data, and I want to test whether the more complex model explains significantly more variance. Due to a necessary smoothing step, my data aren't independent, so I can't use standard tests for this (e.g. an F-test or likelihood-ratio test). I naturally thought of using a permutation test, because I can permute the data before the smoothing step. This way I would introduce the same dependencies in the permuted data that also exist in the observed data, so that the simulated null distribution is fair. However, I can't quite come up with the right way to do this.
I can think of a way to test whether either model explains significant variance. In this case, my algorithm would be (a rough code sketch follows the list):

1. Fit the model to the smoothed observed data, and calculate $R^2$.
2. For 10,000 iterations, repeat steps 3-5:
3. Randomly permute the observed data (i.e. shuffle the response values relative to the predictor values, so that their true relationship is destroyed).
4. Apply smoothing to this permuted data, to introduce the same dependencies that the observed $R^2$-value suffers from.
5. Fit the model to the smoothed permuted data and calculate $R^2$.
6. Compare the observed $R^2$-value to the thus-constructed null distribution of $R^2$-values for this data & model.
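For concreteness, here is a minimal Python sketch of steps 1-6. The moving-average `smooth` and the exponential `model` below are just hypothetical placeholders for my actual smoothing step and non-linear model:

```python
import numpy as np
from scipy.optimize import curve_fit

def smooth(y, window=5):
    """Placeholder smoothing step (simple moving average);
    substitute whatever smoothing the real pipeline applies."""
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="same")

def model(x, a, b):
    """Hypothetical non-linear model f(x; theta); substitute the real one."""
    return a * np.exp(-b * x)

def r_squared(x, y):
    """Fit the model to (x, y) and return R^2."""
    theta, _ = curve_fit(model, x, y, p0=[1.0, 0.1], maxfev=10_000)
    ss_res = np.sum((y - model(x, *theta)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_test(x, y_raw, n_perm=10_000, seed=None):
    rng = np.random.default_rng(seed)
    # Step 1: fit to the smoothed *observed* data.
    r2_obs = r_squared(x, smooth(y_raw))
    # Step 2: build the null distribution over n_perm permutations.
    r2_null = np.empty(n_perm)
    for i in range(n_perm):
        y_perm = rng.permutation(y_raw)           # step 3: destroy the x-y link
        y_perm_smooth = smooth(y_perm)            # step 4: reintroduce dependencies
        r2_null[i] = r_squared(x, y_perm_smooth)  # step 5: refit and score
    # Step 6: one-sided permutation p-value (with the usual +1 correction).
    p_value = (1 + np.sum(r2_null >= r2_obs)) / (1 + n_perm)
    return r2_obs, p_value
```

The +1 correction in the p-value just keeps it from being exactly zero when the observed $R^2$ exceeds every permuted value.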
The point being that in this case it's easy to construct the null distribution, because the null hypothesis holds that there is no relationship between the predictors in the model and the outcome variable, and it is obvious how we can permute the data to emulate this. (Formally, I guess the null hypothesis is that the values of the predictors are exchangeable w.r.t. the values of the outcome variable.)
However, what I want to do is estimate a null distribution for the increase in $R^2$ from one model to the next. The null hypothesis here is that the added parameter in the more complex model is meaningless, i.e. that the two models are exchangeable. However, it seems to me that this is an exchangeability hypothesis on the models, rather than on (some aspect of) the data, so I just don't see what I would permute in a permutation test in order to simulate this. Can anyone help me out? Am I missing something, or is this just not possible, so that a bootstrap is my only option here?
Update: in the original version of this question, I failed to specify that my models were non-linear. That is, they don't have the form $Y=X\beta+\epsilon$ (where $X$ is a matrix of predictors, $\beta$ is a vector of linear coefficients and $\epsilon$ is random noise). Instead, they have the more general form $Y=f(X; \theta)+\epsilon$, where $\theta$ is a vector of model parameters. The simpler model can be obtained by fixing the value of one of the parameters of the more complex/general model.
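For instance (with a made-up functional form; my actual $f$ is different), the nesting might look like this:

```python
import numpy as np

def f_full(x, a, b, c):
    """Hypothetical complex model: f(x; a, b, c) = a * exp(-b*x) + c."""
    return a * np.exp(-b * x) + c

def f_simple(x, a, b):
    """Simpler model, obtained from the complex one by fixing c = 0."""
    return f_full(x, a, b, 0.0)
```

The simpler model is nested in the complex one because it corresponds to fixing one of its parameters ($c$ here) at a constant value.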
https://stats.stackexchange.com/questions/213895/how-to-check-permutation-testing-exchangeability-assumption-when-using-a-general (can't flag because of the bounty). – eric_kernfeld Aug 23 '17 at 15:37