I'm trying to understand partial R-squared as computed by the rsq package. Here is a reproducible example on a standard data set:
library(rsq)
data(esoph)
model1 <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp,
              data = esoph, family = binomial)
attach(esoph)
rsq(model1)
# [1] 0.826124
rsq.partial(model1)
#$variable
#[1] "agegp" "tobgp" "alcgp" "tobgp:alcgp"
#
#$partial.rsq
#[1] 0.6548246836256900182960 -0.0000000000000006661338 0.0000000000000000000000 0.1456091052940046148834
rsq.partial(model1, adj = TRUE)
#$adjustment
#[1] TRUE
#
#$variable
#[1] "agegp" "tobgp" "alcgp" "tobgp:alcgp"
#
#$partial.rsq
#[1] 0.6290653316574579267950 -0.0000000000000006661338 0.0000000000000000000000 0.0308401791394680158120
detach(esoph)
Here, the partial R-squared for the first variable, agegp, seems too high. Compare it with the analysis-of-deviance table:
anova(model1)
#             Df Deviance Resid. Df Resid. Dev
# NULL                           87    227.241
# agegp        5   88.128        82    139.112
# tobgp        3   19.085        79    120.028
# alcgp        3   66.054        76     53.973
# tobgp:alcgp  9    6.489        67     47.484
cat("pseudo R2 agegp: ", 1-139.112/model1$null.deviance, "\n")
# pseudo R2 agegp: 0.3878206
So, from the deviance table, the pseudo R-squared attributable to agegp is 0.3878. This is close to the value (0.379) that rsq() and rsq.partial() return for a model containing only agegp:
model1 <- glm(cbind(ncases, ncontrols) ~ agegp, data = esoph, family = binomial)
attach(esoph)
rsq(model1)
# [1] 0.3791795
rsq.partial(model1)
#$adjustment
#[1] FALSE
#
#$variable
#[1] "agegp"
#
#$partial.rsq
#[1] 0.3791795
detach(esoph)
So why does rsq.partial() report a partial R-squared of 0.655 (or 0.629 adjusted) for agegp?
[...] agegp is 0.379 (last two commands). The agegp raises the rsq from rsq(fit.x2x3) = 0.496 up to rsq(fit.x1x2x3) = 0.826, so can the value of adding it be 0.65? I must miss something key here... – Tomas Nov 03 '19 at 19:20
[...] agegp is ~35%. The confusion is about the meaning of the ~65% -- it is the variability which agegp explains after adjusting for the other two variables. So there is no contradiction. – dipetkov Nov 03 '19 at 21:38
[...] (rsq(fit.x1x2x3) - rsq(fit.x2x3))/(1 - rsq(fit.x2x3)) which corresponds to the definition of 1 - deviance(fit.x1x2x3)/deviance(fit.x2x3) (just to illustrate principle, different measure of variance was used by rsq). This can be converted to %incMSE from randomForest and also to partial correlation coefficient. Thanks for help! – Tomas Nov 03 '19 at 22:20
[...] %incMSE since %incMSE is computed by permuting, i.e. "dropping" each variable, in turn. – dipetkov Nov 03 '19 at 22:39
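As a worked check of the formula quoted in the comments, plugging in the two rounded R-squared values mentioned there (0.826 for the model with agegp, 0.496 for the model without it) gives roughly the 0.655 reported by rsq.partial(). The subscripts "full" and "reduced" are labels introduced here for illustration, standing for fit.x1x2x3 and fit.x2x3; this is a sketch of the principle, not a statement of how rsq computes the value internally.

```latex
\[
R^2_{\text{partial}}(\texttt{agegp})
  = \frac{R^2_{\text{full}} - R^2_{\text{reduced}}}{1 - R^2_{\text{reduced}}}
  = \frac{0.826 - 0.496}{1 - 0.496}
  = \frac{0.330}{0.504}
  \approx 0.655
\]
```

The denominator is the share of variation left unexplained by tobgp, alcgp and their interaction, so the ratio measures how much of that remaining variation agegp accounts for. It is therefore not directly comparable with the 0.379 obtained when agegp is the only predictor.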