Plotting results of ordered logistic regression analysis

Question

I tried to plot the results of an ordered logistic regression analysis by calculating the probabilities of endorsing every answer category of the dependent variable (6-point Likert scale, ranging from "1" to "6"). However, I've received strange probabilities when I calculated the probabilities based on this formula: $\rm{Pr}(y_i \le k|X_i) = \rm{logit}^{-1}(X_i\beta)$.

Below you see how exactly I tried to calculate the probabilities and plot the results of the ordered logistic regression model (m2) that I fitted using the polr function (MASS package). The probabilities (probLALR) that I calculated and used to plot an "expected mean score" are puzzling as the expcected mean score in the plot increases along the RIV.st continuum while the coefficient for RIV.st is negative (-0.1636). I would have expected that the expected mean score decreases due to the negative main effect of RIV.st and the irrelevance of the interaction terms for the low admiration and low rivalry condition (LALR) of the current 2 by 2 design (first factor = f.adm; second factor = f.riv; dummy coding 0 and 1).

Any idea of how to make sense of the found pattern? Is this the right way to calculate the probabilities? The way I used the intercepts in the formula to calculate the probabilities might be problematic (cf., Negative coefficient in ordered logistic regression).

m2 <- polr(short.f ~ 1 + f.adm*f.riv + f.adm*RIV.st + f.riv*RIV.st, data=sampleNS)

# f.adm  = dummy (first factor of 2 by 2 design);
# f.riv  = dummy (second factor of 2 by 2 design);
# RIV.st = continuous predictor (standardized)
summary(m2)
Coefficients:
                Value Std. Error t value
f.adm1         1.0203    0.14959  6.8203
f.riv1        -0.8611    0.14535 -5.9240
RIV.st        -0.1636    0.09398 -1.7403
f.adm1:f.riv1 -1.2793    0.20759 -6.1625
f.adm1:RIV.st  0.0390    0.10584  0.3685
f.riv1:RIV.st  0.6989    0.10759  6.4953

Intercepts:
    Value    Std. Error t value 
1|2  -2.6563   0.1389   -19.1278
2|3  -1.2139   0.1136   -10.6898
3|4  -0.3598   0.1069    -3.3660
4|5   0.9861   0.1121     8.7967
5|6   3.1997   0.1720    18.6008

Here you see how I tried to calculate the probabilities (probLALR) for 1 of the 4 conditions of the 2 by 2 design:

inv.logit  <- function(x){ return(exp(x)/(1+exp(x))) }
Pred       <- seq(-3, 3, by=0.01)
b = c(-2.6563,-1.2139,-0.3598,0.9861,3.1997) # intercepts of model m2
a = c(1.0203,-0.8611,-0.1636,-1.2793,0.0390,0.6989) # coefficients of m2
probLALR   <- data.frame(matrix(NA,601,5))
for (k in 1:5){ 
    probLALR[,k] <- inv.logit(b[k] + a[1]*0 + a[2]*0 + 
                               a[3]*Pred  + a[4]*0*0 + 
                               a[5]*Pred*0 + a[6]*Pred*0)
}

plot(Pred,probLALR[,1],type="l",ylim=c(0,1))   # p(1)
lines(Pred,probLALR[,2],col="red")             # p(1 or 2)
lines(Pred,probLALR[,3],col="green")           # P(1 or 2 or 3)
lines(Pred,probLALR[,4],col="orange")          # P(1 or 2 or 3 or 4)
lines(Pred,probLALR[,5],col="orange")          # P(1 or 2 or 3 or 4 or 5)

# option response functions:

orc = matrix(NA,601,6)
orc[,6] = 1-probLALR[,5]        # prob of 6
orc[,5]= probLALR[,5]-probLALR[,4]  # prob of 5
orc[,4]= probLALR[,4]-probLALR[,3]  # prob of 4
orc[,3]= probLALR[,3]-probLALR[,2]  # prob of 3
orc[,2]= probLALR[,2]-probLALR[,1]  # prob of 2
orc[,1]= probLALR[,1]           # prob of 1


plot(Pred,orc[,1],type="l",ylim=c(0,1))   # p(1)
lines(Pred,orc[,2],col="red")             # p(2)
lines(Pred,orc[,3],col="green")           # P(3)
lines(Pred,orc[,4],col="orange")          # P(4)
lines(Pred,orc[,5],col="purple")          # P(5)
lines(Pred,orc[,6],col="purple")          # P(6)

# mean score

mean = orc[,1]*1+orc[,2]*2+orc[,3]*3+orc[,4]*4+orc[,5]*5+orc[,6]*6
plot(Pred,mean,type="l",xlab="RIV.st",ylab="expected mean score",ylim=c(1,6))

That still isn't reproducible. In the meantime, let me just point out that predict(m2, type ="prob") computes the probabilities for you. Also package 'effects' can be used for effect displays of models fitted by polr(). — Achim Zeileis, Mar 15 '15 at 20:13
Dear Achim, thanks a lot for the hint to the predict function. I didn't know that you can predict probabilities. That solves the whole issue. — Michael Grosz, Mar 16 '15 at 01:14
@AchimZeileis: I've now updated my question and code. The results are still puzzling when I calculate the prob by hand because the expected mean scure curve is increasing even though it should be decreasing. The results are different when I use the predict(m2,type="prob) (see my answer below). @Aniko — Michael Grosz, Mar 17 '15 at 15:29

score 1 · Accepted Answer · edited Mar 17 '15 at 17:17

Thanks to Wilco Emons for the following solution to the problem:

In polr the cumulative link model is parameterized a bit different than in Agresti’s book that is referred to. The problem can be easily solved by changing the code line:

probLALR[,k] <- inv.logit(b[k] +  a[1]*0 + a[2]*0 + a[3]*Pred + a[4]*0*0 + 
                                  a[5]*Pred*0 + a[6]*Pred*0                )

into

probLALR[,k] <- inv.logit(b[k] - (a[1]*0 + a[2]*0 + a[3]*Pred + a[4]*0*0 + 
                                  a[5]*Pred*0 + a[6]*Pred*0)               )

Thanks also to Achim Zeileis for his suggestion to use predict(m2, type="prob")! Below you will find a way to calculate the probabilities by the means of the predict() function:

Pred             <- seq(-3, 3, by=0.01)
PRED.LALR        <- data.frame(rep(NA,601))
PRED.LALR$f.adm  <- as.factor(rep(0,601))
PRED.LALR$f.riv  <- as.factor(rep(0,601))
PRED.LALR$RIV.st <- Pred
prob.LALR        <- predict(m2,PRED.LALR,type="prob")

scoreLALR        <- prob.LALR[,1]*1 + prob.LALR[,2]*2 + prob.LALR[,3]*3 + 
                    prob.LALR[,4]*4 + prob.LALR[,5]*5 + prob.LALR[,6]*6
plot(Pred, scoreLALR, col="green", ylim=c(1,6))

Plotting results of ordered logistic regression analysis

1 Answers1