No doubt this is a stupid question but I can't seem to find help anywhere online.
I want to fit a logistic regression with two categorical independent variables.
Ideally, I would like to see how each group compares to the overall mean, but for some reason, in R, I'm not able to do this. Whatever I try, the coefficients always end up being comparisons against a reference level of at least one of the explanatory variables. See the example and output below:
set.seed(123)
group1 <- rep(c("A", "B", "C"), times = 100)
group2 <- rep(c("D", "E", "F"), each = 100)
dat <- rbinom(300, 1, 0.5)
model <- glm(dat ~ group1 + group2 + 0, family = binomial)
summary(model)
#Coefficients:
#            Estimate Std. Error z value Pr(>|z|)
#group1A 0.14471 0.25837 0.560 0.575
#group1B -0.17756 0.25991 -0.683 0.495
#group1C -0.33858 0.26139 -1.295 0.195
#group2E 0.12461 0.28455 0.438 0.661
#group2F 0.04537 0.28466 0.159 0.873
If I did something similar with a linear regression, I would get an intercept which would be the mean of all the data, and then the other coefficients would show how each group differs from that mean. Here, however, I am pretty sure that all the coefficients are showing me how the groups differ from the mean of group D.
Surely it's not a matter of having to estimate too many parameters; there are 300 data points. All I would like to know is the overall mean (on the log-odds scale) and how groups A-F differ from that mean. What am I missing?
EDIT: I went with a workaround that is a bit subpar (rough sketch below). In a loop over the groups (about 100 for the problem I actually have), I construct an indicator variable that is 1 for the group I'm currently interested in (e.g. group D) and 0 otherwise, then fit a model each time with that indicator as an explanatory variable alongside the other variables I wanted to account for. It takes a little while to run, but it's not completely infeasible.
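For reference, a rough sketch of that loop in R, reusing the simulated dat/group1/group2 from the question (here group2 stands in for the grouping of interest and group1 for the variables being accounted for; both are assumptions, since the real problem has about 100 groups):
# Fit one model per group, each with a 0/1 indicator for that group
results <- list()
for (lev in unique(group2)) {
  indicator <- as.numeric(group2 == lev)  # 1 for the current group, 0 otherwise
  fit <- glm(dat ~ indicator + group1, family = binomial)
  results[[lev]] <- summary(fit)$coefficients["indicator", ]
}
do.call(rbind, results)  # one row per group: estimate, SE, z value, p value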
Do let me know if there is another workaround where I could just put the raw data in to begin with and have every group compared to the overall mean. A worked example would be nice.
Use contr.sum. You can also compute contrasts post-hoc to make the comparisons you want rather than parameterize the model so that coefficients correspond to the quantities of interest. – Noah Jan 04 '23 at 19:14
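For anyone finding this later: below is a minimal sketch of what the comment suggests with contr.sum, again reusing the simulated data from the question (the exact coding is my reading of the suggestion rather than a definitive answer). With sum-to-zero ("effect") coding, the intercept is the grand mean on the log-odds scale and each coefficient is a level's deviation from it.
g1 <- factor(group1)
g2 <- factor(group2)
model_sum <- glm(dat ~ g1 + g2, family = binomial,
                 contrasts = list(g1 = contr.sum, g2 = contr.sum))
summary(model_sum)
# The deviation for the last level of each factor (C and F) is not printed;
# it equals minus the sum of that factor's printed deviations.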