
I am using the UCBAdmissions dataset and glm from R to check for Simpson's paradox. I am aware that, with the data I have, it is easier to visualise the paradox, but I am trying to inspect it via a Poisson/log-linear model too.

My attempt

Visualising this is easy:

library(broom)
library(dplyr)
library(ggplot2)
library(scales)

ucb_tidy <- tidy(UCBAdmissions) %>%
  group_by(Gender, Dept) %>%
  mutate(prop = n / sum(n))

ggplot(ucb_tidy, aes(x = Gender, y = prop, fill = Admit)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ Dept, nrow = 2) +
  guides(fill = guide_legend(reverse = TRUE)) +
  scale_y_continuous(labels = percent_format()) +
  ggtitle("Acceptance Rate Split by Gender & Department") +
  xlab("Gender") +
  ylab("% of Applicants")

[Plot: acceptance rate split by gender and department, one panel per department]

But to check and show this with a Poisson model, I tried the following:

# Postulate that Gender and Admission are conditionally independent, given Dept.
cond_ind1 <- glm(n ~ (Gender + Admit) * Dept, family = poisson, data = ucb_tidy)
summary(cond_ind1)

Call:
glm(formula = n ~ (Gender + Admit) * Dept, family = poisson, data = ucb_tidy)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.4776  -0.4144   0.0098   0.3089   2.2321

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)          6.27557    0.04248 147.744  < 2e-16 ***
GenderFemale        -2.03325    0.10233 -19.870  < 2e-16 ***
AdmitRejected       -0.59346    0.06838  -8.679  < 2e-16 ***
DeptB               -0.40575    0.06770  -5.993 2.06e-09 ***
DeptC               -1.53939    0.08305 -18.536  < 2e-16 ***
DeptD               -1.32234    0.08159 -16.207  < 2e-16 ***
DeptE               -2.40277    0.11014 -21.816  < 2e-16 ***
DeptF               -3.09624    0.15756 -19.652  < 2e-16 ***
GenderFemale:DeptB  -1.07581    0.22860  -4.706 2.52e-06 ***
GenderFemale:DeptC   2.63462    0.12343  21.345  < 2e-16 ***
GenderFemale:DeptD   1.92709    0.12464  15.461  < 2e-16 ***
GenderFemale:DeptE   2.75479    0.13510  20.391  < 2e-16 ***
GenderFemale:DeptF   1.94356    0.12683  15.325  < 2e-16 ***
AdmitRejected:DeptB  0.05059    0.10968   0.461    0.645
AdmitRejected:DeptC  1.20915    0.09726  12.432  < 2e-16 ***
AdmitRejected:DeptD  1.25833    0.10152  12.395  < 2e-16 ***
AdmitRejected:DeptE  1.68296    0.11733  14.343  < 2e-16 ***
AdmitRejected:DeptF  3.26911    0.16707  19.567  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.095  on 23  degrees of freedom
Residual deviance:   21.736  on  6  degrees of freedom
AIC: 216.8

Number of Fisher Scoring iterations: 4

This, however, is not a good fit if we use pchisq to check the goodness of fit, and it does not directly show how the gender-admission relationship changes across departments.
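For example, a goodness-of-fit check along these lines (a sketch based on the residual deviance and degrees of freedom reported above) rejects the model:

# Compare the residual deviance to a chi-squared distribution on the residual df
pchisq(deviance(cond_ind1), df.residual(cond_ind1), lower.tail = FALSE)
# equivalently, using the values from the summary: pchisq(21.736, 6, lower.tail = FALSE)
# p is roughly 0.001, so the conditional-independence model fits poorly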

I thought of another formulation:

# Postulate that Gender and Admission are jointly independent of Dept (or not)
joint_ind1 <- glm(n ~ Dept + (Gender*Admit), family = poisson, data = ucb_tidy)
summary(joint_ind1)

Call:
glm(formula = n ~ Dept + (Gender * Admit), family = poisson, data = ucb_tidy)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-19.7226   -7.8760    0.6475    7.1298   14.7146

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                  5.50922    0.04106 134.188  < 2e-16 ***
DeptB                       -0.46679    0.05274  -8.851  < 2e-16 ***
DeptC                       -0.01621    0.04649  -0.349 0.727355
DeptD                       -0.16384    0.04832  -3.391 0.000696 ***
DeptE                       -0.46850    0.05276  -8.879  < 2e-16 ***
DeptF                       -0.26752    0.04972  -5.380 7.44e-08 ***
GenderFemale                -0.76584    0.05128 -14.933  < 2e-16 ***
AdmitRejected                0.22013    0.03879   5.675 1.38e-08 ***
GenderFemale:AdmitRejected   0.61035    0.06389   9.553  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.1  on 23  degrees of freedom
Residual deviance: 2004.2  on 15  degrees of freedom
AIC: 2181.3

Number of Fisher Scoring iterations: 5

This has an even poorer fit, and it lacks the logical appeal of the conditional-independence model.


My question is:

  1. How can I study Simpson's paradox via the Poisson regression model? Would you recommend another model (e.g., binomial logistic regression)?

1 Answer


Here are a few ideas. This answer draws on a post by Norm Matloff, who fits log-linear models to the same dataset in R using loglin() to show how the paradox relates to the order in which the variables are considered.
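For reference, a rough sketch of the loglin() version of the model with all two-way interactions (the equivalent of m2b fitted below; the margin indices assume the dimension order Admit, Gender, Dept of the UCBAdmissions array):

# Sketch only: all two-way interactions, no three-way term, via stats::loglin()
# margins: 1 = Admit, 2 = Gender, 3 = Dept
ll_fit <- loglin(UCBAdmissions, margin = list(c(1, 2), c(1, 3), c(2, 3)), param = TRUE)
ll_fit$param   # sum-to-zero parameter estimates, comparable to the contr.sum glm() fits below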

First we'll load the packages and data. I'll also set up deviation contrasts to mirror the output we'd get from a traditional log-linear analysis (as in the post above), but using glm().

# Import Packages
library(tidyverse)

# Import data
ucb_tidy <- broom::tidy(UCBAdmissions) %>%
  mutate_at(c("Gender", "Dept", "Admit"), as.factor)

# Specify Deviation Contrasts
contrasts(ucb_tidy$Admit) <- contrasts(ucb_tidy$Gender) <- contr.sum(2)
contrasts(ucb_tidy$Dept) <- contr.sum(6)

Poisson Regression

Now if you're interested in model fit, we can fit three nested models.

  • The first will include the admission-by-gender terms while ignoring the effect of department.
  • The second will also include department and the department-admission interaction.
  • The third will do the same, but additionally include an interaction between gender and department.

We can then use likelihood ratio tests (using the lmtest package) to evaluate whether these additional department-interaction terms progressively improve our model fit.

library(lmtest)

# Model Building Approach
m1 <- glm(n ~ Admit + Gender + Admit:Gender, data = ucb_tidy, family = poisson)
m2a <- update(m1, . ~ . + Dept + Admit:Dept)
m2b <- update(m2a, . ~ . + Gender:Dept)

# Test model fit using lmtest
lrtest(m1, m2a, m2b)

Incorporating the department as a predictor that also interacts with admissions (i.e. m2a) improves the model fit relative to the model ignoring the department (i.e. m1), $\chi^2=1014.8, p < .0001$. The model that additionally incorporates an interaction between gender and department (i.e. m2b) improves model fit over and above this, $\chi^2=1128.7, p < .0001$. We would therefore select m2b as our final model [1], although you could continue on to the three-way interaction if you're feeling so bold (that model would be saturated, however).
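If you did want to look at that saturated model, a rough sketch would be:

# Saturated model: add the three-way interaction (sketch only)
m3 <- update(m2b, . ~ . + Admit:Gender:Dept)
lrtest(m2b, m3)    # LR test for the three-way term
deviance(m3)       # ~0, since the saturated model reproduces the observed counts exactly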

The 'paradox' is probably easier to understand, though, when you plot the model predictions. I'll use emmeans for this, and put the predictions on the response scale (predicted counts) rather than the default log scale. The patchwork package just joins the two plots at the end using +.

# Plot model predictions using emmeans and patchwork
library(emmeans)
library(patchwork)

m1_plot <- emmip(m1, Gender ~ Admit, CIs = TRUE, type = "response") +
  labs(title = "m1 - No Department Term") +
  theme_minimal() +
  ylim(c(50, 300)) +
  theme(legend.position = "none")

m2b_plot <- emmip(m2b, Gender ~ Admit, CIs = TRUE, type = "response") +
  labs(title = "m2b - Department Interactions") +
  theme_minimal() +
  ylim(c(50, 300))

m1_plot + m2b_plot

[Plot: model-predicted counts by gender and admission status, two panels: "m1 - No Department Term" and "m2b - Department Interactions"]

While fewer women were applying overall in this UCB dataset (roughly half the number of men), the model that ignores which departments they applied to suggests that women were rejected at a higher rate than men. The model that does include department as a predictor, and particularly the department-gender interaction, suggests the opposite effect (when averaging over the departments) [2]. You can check how close the model predictions are to the raw data using emmeans once more:

# Predictions
emmeans(m2b, ~ Dept:Admit + Dept:Gender, type = "response")

# Observed
ucb_tidy

Logistic Regression

Finally, it is indeed possible to model the proportions of people admitted by department and gender using logistic regression (rather than the raw counts, as in the Poisson regression above). You'd do that like so:

# Convert Data
ucb_logit <- spread(ucb_tidy, Admit, n)

# Model Building Approach
ml1 <- glm(cbind(Admitted, Rejected) ~ Gender, data = ucb_logit, family = binomial)
ml2a <- update(ml1, . ~ . + Dept)
ml2b <- update(ml2a, . ~ . + Gender:Dept)

# Test Nested Models
lrtest(ml1, ml2a, ml2b)

# Plot Probability of Acceptance by Dept & Gender
emmip(ml2b, Gender ~ Dept, type = "response", CIs = TRUE)

This plot shows the model can replicate the observed proportions in the dataset fairly well.
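If you'd rather check numerically than by eye, you can compare the observed admission proportions against the fitted probabilities directly (a sketch; obs_prop and fit_prop are just illustrative column names). Because ml2b contains the gender-by-department interaction it is saturated for these 12 aggregated rows, so the two should match essentially exactly:

# Sketch: observed admission proportions vs fitted probabilities from ml2b
ucb_logit %>%
  mutate(obs_prop = Admitted / (Admitted + Rejected),  # observed proportion admitted
         fit_prop = fitted(ml2b)) %>%                  # fitted probability of admission
  select(Dept, Gender, obs_prop, fit_prop)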


[1] This model gives us parameter estimates identical to Norm's loglin() results.
[2] Note that the scales are slightly different because including department in model m2b weights the estimates differently.

awhug
  • (+1) Nice illustration! P.S. Shouldn't this be broom::tidy instead of just tidy? – chl Oct 27 '20 at 10:58
  • Nice answer and good plots! – StupidWolf Oct 27 '20 at 11:29
  • @awhug I am indebted to you for having educated me :) I think in this use case a logistic regression model is probably more intuitive, as it lets us see the difference in the probability of being admitted directly, while a Poisson/log-linear model only shows the difference in terms of counts. Perhaps that is also because a log-linear model does not directly model the relationship between dependent and independent variables (instead assuming no such distinction exists). – info_seeker Oct 27 '20 at 16:43
  • My pleasure @info_seeker. Yeah, the logistic model seems a little 'neater' to me in this case; I just wanted to make sure to answer your original question about the Poisson model. And thank you @chl -- I had assumed broom was loaded with the tidyverse, but have just edited to be explicit now. – awhug Oct 27 '20 at 23:34