I have regression results where unconstrained OLS is near-optimal: its out-of-sample scores are almost the best among the constrained regression models I compared it against. Although the ratio of observations to features is high, some features are highly correlated and I expect many of them not to be useful.
I want to explain these results intuitively and to check that they are correct, since I had expected unconstrained OLS to do poorly.
As a sanity check, I confirmed that PCA regression (PCR) is near-optimal when the number of components is close to the number of features, and does badly when the number of components is small. By contrast, PLS does well with a small number of components.
Are variance inflation factors (VIFs) relevant here, given that I care more about out-of-sample prediction scores (e.g., R2) than about the estimation error of the parameters (which could indeed be high for unconstrained regression)? In principle, if I orthogonalised all my features, the VIFs would drop to 1, but the column space of the design, and hence the fitted values, would not change, so I would expect the same prediction score. This suggests VIFs are not relevant here.
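This invariance is easy to check numerically. Below is a sketch on simulated data (the design is an arbitrary strongly-correlated toy, and the VIF is computed as the diagonal of the inverse correlation matrix): orthonormalising the centred design via QR sends every VIF to 1, yet OLS produces exactly the same fitted values because the column space is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 5
base = rng.normal(size=(n, 1))
X = base + 0.1 * rng.normal(size=(n, p))   # strongly correlated columns
y = X @ rng.normal(size=p) + rng.normal(size=n)

def vif(A):
    # VIF_j is the j-th diagonal entry of the inverse correlation matrix
    R = np.corrcoef(A, rowvar=False)
    return np.diag(np.linalg.inv(R))

def fit(A, t):
    # OLS fitted values: projection of t onto the column space of A
    return A @ np.linalg.lstsq(A, t, rcond=None)[0]

Xc = X - X.mean(axis=0)          # centre so VIFs are exact
yc = y - y.mean()
Q, _ = np.linalg.qr(Xc)          # orthonormal basis for the same span

print(vif(Xc).max())             # large: correlated design
print(vif(Q).max())              # 1: orthogonalised design
print(np.allclose(fit(Xc, yc), fit(Q, yc)))  # identical predictions
```

So VIFs measure something about the parametrisation, not about the fitted values themselves.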
I understand how the parameter estimation error blows up with feature correlation, since Var(beta_hat) = sigma^2 (X^T X)^{-1}, but I care more about prediction scores (the error in estimating Y, not beta). Is there a reference discussing this distinction?
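A small Monte Carlo makes the distinction visible (toy numbers, two nearly collinear features, true coefficients fixed at 1): across resimulated noise, the individual coefficient estimates swing wildly, but the prediction at a point that respects the same collinearity pattern is stable, because only the well-determined direction of X matters for it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly collinear with x1
X = np.column_stack([x1, x2])
beta = np.array([1.0, 1.0])
x_test = np.array([0.5, 0.5])         # a "typical" point (x2 close to x1)

coefs, preds = [], []
for _ in range(2000):
    y = X @ beta + rng.normal(size=n)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    coefs.append(b)
    preds.append(x_test @ b)
coefs, preds = np.array(coefs), np.array(preds)

print(coefs.std(axis=0))   # large: individual coefficients are unstable
print(preds.std())         # small: the prediction itself is stable
```

The errors in the two coefficients are almost perfectly negatively correlated, so they cancel in X beta_hat; that cancellation is exactly why high VIFs need not hurt out-of-sample R2.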