I have a logistic regression model that regresses a binary response variable (Y) on a set of binary independent variables (X).

The data were gathered from different individuals, for whom other information (e.g. socio-demographic data) also exists.

I can therefore split my X and Y into subsets (e.g. by gender) and train the same model for the individual subgroups.

The individual groups give different coefficients for the independent variables.

Say, for example, that for x1 I get the coefficient β1=0.123 in group 1 and β1=-0.456 in group 2.

How can I use the coefficients to compare the two groups with respect to the independent variable x1? What else do the coefficients tell me? How can I show statistically significant differences between the coefficients (and the independent variables)?

Dave
  • 62,186
ltsstar
  • 121

1 Answer

I would do this in one regression that includes gender ($G$) as a variable, along with an interaction between gender and your $X$ of interest. Then you test the coefficient on the interaction, which measures how the effect of $X$ (its slope on the log-odds scale) differs between genders.

$$ \operatorname{logit}(\mathbb E[Y \vert X, G]) = \beta_0 + \beta_1 X + \beta_2 G + \beta_3 XG $$
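
To see why $\beta_3$ is the quantity to test, it may help to write the model out for each group. Assuming $G$ is coded as an indicator (0 for the reference group and 1 for the other, which is my assumption about the coding rather than something the model fixes), the equation above gives:

$$
\begin{aligned}
G = 0: \quad & \operatorname{logit}(\mathbb E[Y \vert X, G = 0]) = \beta_0 + \beta_1 X \\
G = 1: \quad & \operatorname{logit}(\mathbb E[Y \vert X, G = 1]) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X
\end{aligned}
$$

So the slope on $X$ is $\beta_1$ in one group and $\beta_1 + \beta_3$ in the other, and testing $H_0: \beta_3 = 0$ tests whether the two slopes are equal.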

This is easy to do in software such as R.

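set.seed(2022)  # make the simulation reproducible
N <- 100
X <- runif(N)  # predictor of interest
G <- sample(c("male", "female"), N, replace = TRUE)  # grouping variable
y <- rbinom(N, 1, 0.5)  # binary response, simulated independently of X and G
L <- glm(y ~ X*G, family = binomial)  # X*G expands to X + G + X:G
summary(L)  # the X:Gmale row tests the interaction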

I get a p-value on the interaction term of $0.675$. Since the simulated $y$ is independent of my simulated $X$ and $G$, this is not surprising.
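
To connect this back to the separate per-group fits described in the question, here is a minimal sketch on the same kind of simulated data. The object names (`pooled`, `fit_female`, `fit_male`, `slope_female`, `slope_male`) are my own, and the sketch assumes R's default treatment coding, under which "female" is the reference level because it comes first alphabetically. With only $X$, $G$, and their interaction in the model, the slopes implied by the pooled fit should agree with the separate fits up to numerical tolerance. What the pooled fit adds is a standard error and test for the difference.

```r
# Simulated data with no true effect of X or G (illustrative only).
set.seed(2022)
N <- 100
X <- runif(N)
G <- factor(sample(c("male", "female"), N, replace = TRUE))
y <- rbinom(N, 1, 0.5)

# One pooled model with an X-by-G interaction.
pooled <- glm(y ~ X * G, family = binomial)

# Separate fits per group, as described in the question.
fit_female <- glm(y ~ X, family = binomial, subset = G == "female")
fit_male   <- glm(y ~ X, family = binomial, subset = G == "male")

# Group-specific slopes implied by the pooled model.
b <- coef(pooled)
slope_female <- unname(b["X"])                 # reference level
slope_male   <- unname(b["X"] + b["X:Gmale"])  # reference slope plus interaction

# Compare with the slopes from the separate fits.
c(pooled = slope_female, separate = unname(coef(fit_female)["X"]))
c(pooled = slope_male,   separate = unname(coef(fit_male)["X"]))

# Estimate, standard error, z value and p-value for the slope difference.
coef(summary(pooled))["X:Gmale", ]
```

The last line is the Wald test of the interaction, which is the same row that `summary()` reports.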

For extensions of this idea, you might be interested in chunk tests, which test several coefficients jointly. For instance, if $G$ had a third level, nonbinary, the interaction would involve two coefficients instead of one, and you could test them together as follows.

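library(lmtest)  # provides lrtest()
set.seed(2022)
N <- 100
X <- runif(N)
G <- sample(c("male", "female", "nonbinary"), N, replace = TRUE)  # three-level grouping variable
y <- rbinom(N, 1, 0.5)
L0 <- glm(y ~ X + G, family = binomial)  # reduced model: no interaction
L1 <- glm(y ~ X*G, family = binomial)  # full model: adds both X:G interaction terms
lmtest::lrtest(L1, L0)  # likelihood-ratio test of the interaction terms jointly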

This is analogous to a partial F-test in linear regression, which is one example of a chunk test.
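
If you would rather not depend on `lmtest`, the same likelihood-ratio chunk test can be sketched in base R. This is a sketch under the same simulation assumptions as above. The names `lr_stat`, `lr_df`, `p_manual`, and `p_anova` are my own. It relies on the fact that the two models are nested, so the drop in deviance is the likelihood-ratio statistic for a binomial model.

```r
# Same simulated three-group data as above (illustrative only).
set.seed(2022)
N <- 100
X <- runif(N)
G <- factor(sample(c("male", "female", "nonbinary"), N, replace = TRUE))
y <- rbinom(N, 1, 0.5)

L0 <- glm(y ~ X + G, family = binomial)  # reduced model
L1 <- glm(y ~ X * G, family = binomial)  # full model with interaction

# Likelihood-ratio statistic: drop in deviance from reduced to full model.
lr_stat <- deviance(L0) - deviance(L1)
# Degrees of freedom: number of extra coefficients in the full model.
lr_df <- df.residual(L0) - df.residual(L1)
# Upper-tail chi-squared p-value.
p_manual <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same test via the built-in analysis-of-deviance table.
tab <- anova(L0, L1, test = "Chisq")
p_anova <- tab[2, "Pr(>Chi)"]

c(statistic = lr_stat, df = lr_df, p_manual = p_manual, p_anova = p_anova)
```

Here the test has two degrees of freedom because a three-level $G$ contributes two interaction coefficients.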

Dave
  • 62,186