How to interpret binomial GLM coefficient w/ proportion as outcome

Question

If I am doing a binomial GLM with a proportion as an outcome--how do I interpret the model coefficients? I understand how it is done when the outcome is an event (e.g., 0/1), but it is less clear to me how to interpret it in this case. Is it still the log odds that the outcome (i.e., proportion) is 1?

A silly example below predicting population-level happiness based on number of televisions. My question is: how do I understand/interpret the coefficient for n_with_tv (0.003524932)?

```r
set.seed(1)
mod_df = data.frame(
  island = 1:10,
  pop = sample(1000, 10)
)
mod_df$n_happy = round(mod_df$pop * sample(seq(.2, .8, .01), 10))
mod_df$prop_happy = mod_df$n_happy / mod_df$pop
mod_df$n_with_tv = round(mod_df$pop * (mod_df$prop_happy + rnorm(10, sd = .10)))
mod_df$prop_with_tv = mod_df$n_with_tv / mod_df$pop
mod_df
#>    island pop n_happy prop_happy n_with_tv prop_with_tv
#> 1       1 836     585  0.6997608       518    0.6196172
#> 2       2 679     353  0.5198822       275    0.4050074
#> 3       3 129      52  0.4031008        48    0.3720930
#> 4       4 930     725  0.7795699       697    0.7494624
#> 5       5 509     310  0.6090373       289    0.5677800
#> 6       6 471     344  0.7303609       356    0.7558386
#> 7       7 299     194  0.6488294       167    0.5585284
#> 8       8 270      78  0.2888889        90    0.3333333
#> 9       9 978     254  0.2597137       133    0.1359918
#> 10     10 187      52  0.2780749        48    0.2566845
mod_glm = glm(prop_happy ~ n_with_tv,
              weights = pop,
              family = "binomial",
              data = mod_df)
summary(mod_glm)$coef
#>                 Estimate   Std. Error   z value      Pr(>|z|)
#> (Intercept) -0.924823050 0.0551110306 -16.78109  3.356306e-63
#> n_with_tv    0.003524932 0.0001494939  23.57910 6.315129e-123
```

score 3 · Accepted Answer · answered Aug 08 '23 at 15:35

This model is called the "fractional logit". You should actually use robust standard errors and the quasibinomial family for this kind of analysis, since the outcome does not have a binomial distribution.
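A minimal sketch of that refit, assuming the data are the ten rows printed in the question (hard-coded here so the block runs on its own). The choice of `sandwich::vcovHC()` with `type = "HC3"` is an assumption made for illustration, not something the answer prescribes; other robust estimators would be equally defensible.

```r
# Data copied from the table printed in the question
mod_df <- data.frame(
  pop       = c(836, 679, 129, 930, 509, 471, 299, 270, 978, 187),
  n_happy   = c(585, 353, 52, 725, 310, 344, 194, 78, 254, 52),
  n_with_tv = c(518, 275, 48, 697, 289, 356, 167, 90, 133, 48)
)
mod_df$prop_happy <- mod_df$n_happy / mod_df$pop

# Same mean model as in the question, but with the quasibinomial family
mod_quasi <- glm(prop_happy ~ n_with_tv,
                 weights = pop,
                 family = quasibinomial,
                 data = mod_df)

# Point estimates match the binomial fit; only the standard errors change
coef(mod_quasi)

# Robust standard errors (assumption: HC3 from the sandwich package, if installed)
if (requireNamespace("sandwich", quietly = TRUE)) {
  robust_se <- sqrt(diag(sandwich::vcovHC(mod_quasi, type = "HC3")))
  print(cbind(estimate = coef(mod_quasi), robust_se = robust_se))
}
```

The coefficients should agree with the `family = "binomial"` fit in the question, because the two families share the same estimating equations for the mean.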

The coefficient does not have a very useful interpretation on its own. Formally, it reads the same way as in logistic regression on a binary event: a coefficient of $b$ means that a 1-unit change in $x$ is associated with a $b$-unit change in the logit of the expected outcome, where $\text{logit}(y) = \log\left(\frac{y}{1-y}\right)$. Because the outcome is a proportion rather than an event, we can't speak about probabilities or odds; we can only interpret the logit as a complicated nonlinear function of the outcome.
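To make the logit-scale statement concrete, here is a small sketch that again assumes the ten rows printed in the question. The starting values of 100 and 400 televisions are hypothetical, chosen only to show that a 1-unit step is constant on the logit scale but not on the proportion scale.

```r
# Data copied from the table printed in the question
mod_df <- data.frame(
  pop       = c(836, 679, 129, 930, 509, 471, 299, 270, 978, 187),
  n_happy   = c(585, 353, 52, 725, 310, 344, 194, 78, 254, 52),
  n_with_tv = c(518, 275, 48, 697, 289, 356, 167, 90, 133, 48)
)
mod_df$prop_happy <- mod_df$n_happy / mod_df$pop

mod_glm <- glm(prop_happy ~ n_with_tv,
               weights = pop,
               family = quasibinomial,
               data = mod_df)
b <- coef(mod_glm)[["n_with_tv"]]

# Predicted proportions at x and x + 1 for two hypothetical starting values
new_x <- data.frame(n_with_tv = c(100, 101, 400, 401))
p_hat <- predict(mod_glm, newdata = new_x, type = "response")

# Logit scale: each 1-unit step equals b, whatever the starting value
diff(qlogis(p_hat))[c(1, 3)]

# Proportion scale: the same 1-unit step depends on the starting value
diff(p_hat)[c(1, 3)]
```

The second result is what motivates summarizing the model on the proportion scale instead of reading the coefficient directly.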

For this reason, fractional logit models are often interpreted using marginal effects. The Stata documentation of fracreg explains these models and describes how one would use the margins command after fitting the model to appropriately interpret the model results. When the outcome is a probability, it might be possible to use a log-odds interpretation. In your case, you might be able to interpret the coefficient as the change in the log odds of an individual on an island being happy given a change in the number of TVs on that island. But this interpretation extrapolates a bit from what the model allows, and the marginal effects approach would be preferred.

In R, you can use marginaleffects::avg_slopes() to compute the average marginal effect of the predictor, which can be interpreted as the average rate of change in the outcome corresponding to a change in the predictor (or the average of pointwise derivatives of the average dose-response function across the sample). See my answer here for more intuition on interpreting these quantities.
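A hedged sketch of that calculation, assuming the ten rows printed in the question. The hand calculation uses the logit-link derivative $b \cdot p(1-p)$ averaged over islands without weights, which is an assumption about how to average; the `avg_slopes()` call runs only if marginaleffects is installed, and its result should be close to the hand calculation but need not match to every digit.

```r
# Data copied from the table printed in the question
mod_df <- data.frame(
  pop       = c(836, 679, 129, 930, 509, 471, 299, 270, 978, 187),
  n_happy   = c(585, 353, 52, 725, 310, 344, 194, 78, 254, 52),
  n_with_tv = c(518, 275, 48, 697, 289, 356, 167, 90, 133, 48)
)
mod_df$prop_happy <- mod_df$n_happy / mod_df$pop

mod_glm <- glm(prop_happy ~ n_with_tv,
               weights = pop,
               family = quasibinomial,
               data = mod_df)

# Hand calculation: for a logit link, d(proportion)/dx = b * p * (1 - p)
b <- coef(mod_glm)[["n_with_tv"]]
p_hat <- fitted(mod_glm)
ame_manual <- mean(b * p_hat * (1 - p_hat))  # unweighted average over islands
ame_manual

# Package version, run only if marginaleffects is installed
if (requireNamespace("marginaleffects", quietly = TRUE)) {
  print(marginaleffects::avg_slopes(mod_glm))
}
```

The result reads as the average change in the proportion happy per additional television, which is on the scale of the outcome and so is easier to communicate than the logit coefficient.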

Very helpful--thank you, Noah! – Andrew, Oct 03 '23 at 14:14
