NOT A DUPLICATE To those who marked this question as a duplicate of the post I mentioned in my original post: this is not a duplicate, because the squared correlation obtained with an intercept-only linear model would be NaN or 0, not 0.1 as reported in my post. Furthermore, I am asking how to use the most common R squared formulation, and that post provides no answer to this.
Original post
I used caret with glmnet and selected repeatedcv (10-fold, 5 repeats) to choose the best glmnet parameters (alpha and lambda) in terms of Rsquared (i.e. better = larger Rsquared).
library(glmnet)
library(caret)
alpha.grid <- (1:20) * 0.05
lambda.grid <- 10^seq(4,-4,length=200)
EN.param.grid <- expand.grid(.alpha=alpha.grid, .lambda=lambda.grid)
train.params <- trainControl(method="repeatedcv", number=10, repeats=5)
EN.fit <- train(x=X, y=Y, method="glmnet", tuneGrid=EN.param.grid,
                trControl=train.params, standardize=TRUE, metric="Rsquared")
The best model has no remaining predictors (i.e. only an intercept), so its Rsquared as reported by caret (in the Rsquared column of EN.fit$results) should be 0, but it is not (in my case it is 0.1). This raises two questions:
- Is the test Rsquared calculated in caret in terms of correlation instead of the more traditional sum-of-squares method (as suggested by this post)?
- How can I correct this behavior to obtain a test Rsquared of 0 for an intercept-only model (as many would expect), and still be able to correctly optimize a glmnet parameter search on Rsquared?
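For the second question, one possible approach is a custom summary function passed through trainControl(summaryFunction = ...). The sketch below is a minimal illustration under stated assumptions, not caret's own implementation: r2_traditional and traditionalSummary are hypothetical names, and it assumes caret calls the summary function with a data frame holding obs and pred columns, as it does for defaultSummary.

```r
# Traditional R squared: 1 - SSE / SST, with SST taken around the mean of
# the held-out observations (an assumption; other baselines are possible).
r2_traditional <- function(pred, obs) {
  sse <- sum((obs - pred)^2)
  sst <- sum((obs - mean(obs))^2)
  1 - sse / sst
}

# Hypothetical summary function with the (data, lev, model) signature
# that caret summary functions use.
traditionalSummary <- function(data, lev = NULL, model = NULL) {
  c(RMSE = sqrt(mean((data$obs - data$pred)^2)),
    Rsquared = r2_traditional(data$pred, data$obs))
}

# Quick check on a constant prediction equal to the held-out mean.
set.seed(1)
obs <- rnorm(10)
held_out <- data.frame(obs = obs, pred = rep(mean(obs), 10))
res <- traditionalSummary(held_out)
print(res)

# Intended use (requires caret; not run here):
# train.params <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
#                              summaryFunction = traditionalSummary)
```

Because the value is still named Rsquared, metric="Rsquared" in train() would select on it. Note that in cross-validation the intercept is estimated on the training folds rather than on the held-out fold, so the sum-of-squares value for an intercept-only model would typically be slightly below 0 rather than exactly 0.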
[...] Rsquared calculation used by caret is explained in the linked thread. The question of whether the calculation in the ("suggested by") the linked thread is right is nonsensical. 1) If that's what this community has put forward in the past, why do you think this community will proffer a different answer now? 2) The answer shows the actual code that is being used, so it's pretty canonical. 3) The answer is from the author of caret. Your second question about how to get R / caret to provide a different Rsquared is off topic here. – gung - Reinstate Monica Jul 19 '17 at 18:48
[...] Rsquared calculated... as suggested by this post)?" The answer is clearly yes. Beyond that, this question appears to be off topic. – gung - Reinstate Monica Jul 19 '17 at 19:28
[...] NaN and NOT 0.1. You can see it yourself by typing cor(rnorm(10), rep(42,10))^2 in R (an example with 10 values and an intercept-only model with value 42). – michael Jul 19 '17 at 19:31
[...] traditional version is used in that case. – gung - Reinstate Monica Jul 19 '17 at 19:39
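The one-liner from the comments can be expanded to put the two definitions side by side. This is a sketch in base R only, reusing the constant prediction of 42 from the comment; it does not call caret, and the variable names are chosen here for illustration.

```r
set.seed(42)
obs  <- rnorm(10)
pred <- rep(42, 10)   # constant prediction, as from an intercept-only model

# Correlation-based version: the predictions have zero variance, so the
# correlation is undefined; cor() returns NA and warns that the standard
# deviation is zero (the warning is suppressed here).
r2_corr <- suppressWarnings(cor(obs, pred))^2

# Sum-of-squares version: still defined for constant predictions.
r2_trad <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

print(is.na(r2_corr))
print(r2_trad)
```

The sum-of-squares value is negative here because 42 is a much worse constant than the mean of obs; it would be 0 only if the constant equalled that mean.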