I am running a logistic regression on a data frame that contains numeric, binary and factor variables. The data frame is similar to this:

df <- data.frame(
  Industry    = rep(c("A", "B", "C"), each = 7),          # 7 firms per industry
  Revenue     = rnorm(21),
  CEOCHAIRMAN = rep(c(TRUE, FALSE), length.out = 21),     # alternating TRUE/FALSE
  Size        = rnorm(21)
)

However, when I run the logistic regression, only 2 of the 3 levels of the Industry factor appear in the output:

mod1 <- glm(df$CEOCHAIRMAN ~
              + factor(df$Industry)
              + df$Revenue
              + df$Size,
            family = binomial, maxit = 100)

Call:  glm(formula = df$CEOCHAIRMAN ~ +factor(df$Industry) + df$Revenue + 
df$Size, family = binomial, maxit = 100)

Coefficients:
     (Intercept)  factor(df$Industry)B  factor(df$Industry)C            df$Revenue  
         0.25228              -0.66707              -0.02695              -0.04896  
         df$Size  
         0.20427  

Degrees of Freedom: 20 Total (i.e. Null);  16 Residual
Null Deviance:      29.06 
Residual Deviance: 28.39    AIC: 38.39

How can I include all the levels?

Ludovico
    All the levels *are* included. R uses treatment contrasts by default, so the intercept is the log-odds for the reference level "A" (with the other predictors at zero), and the coefficients for "B" and "C" are differences from that level. This is the standard way R reports regressions with factor predictors. If you really want a separate coefficient for each level, add "+ 0" to your formula to drop the intercept. Note that you should also use the `data` argument in the call to `glm` so that your formula doesn't need `df$` before each variable name. – Allan Cameron Sep 27 '21 at 13:34
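
The point made in the comment can be checked with a short sketch. It rebuilds the toy data from the question (the `set.seed(1)` call and the names `mod_ref`, `mod_all`, `b_ref` and `b_all` are assumptions added here for illustration, not part of the original post), fits the model once with the default intercept and once with `0 +` in the formula, and compares the two sets of coefficients.

```r
# Toy data as in the question; the seed is an added assumption so the
# random columns are reproducible.
set.seed(1)
df <- data.frame(
  Industry    = rep(c("A", "B", "C"), each = 7),
  Revenue     = rnorm(21),
  CEOCHAIRMAN = rep(c(TRUE, FALSE), length.out = 21),
  Size        = rnorm(21)
)

# Default parameterisation: level "A" is the reference and is absorbed
# into the intercept, so only B and C get their own coefficients.
mod_ref <- glm(CEOCHAIRMAN ~ factor(Industry) + Revenue + Size,
               data = df, family = binomial)

# "0 +" removes the intercept, so every Industry level gets a coefficient.
mod_all <- glm(CEOCHAIRMAN ~ 0 + factor(Industry) + Revenue + Size,
               data = df, family = binomial)

names(coef(mod_ref))  # intercept, Industry B and C, Revenue, Size
names(coef(mod_all))  # Industry A, B and C, Revenue, Size

# Same model, different labelling: the per-level coefficients are the
# intercept plus the corresponding difference from level "A".
b_ref <- coef(mod_ref)
b_all <- coef(mod_all)
all.equal(unname(b_all["factor(Industry)A"]),
          unname(b_ref["(Intercept)"]))
all.equal(unname(b_all["factor(Industry)B"]),
          unname(b_ref["(Intercept)"] + b_ref["factor(Industry)B"]))
```

Both fits should describe the same model with the same deviance; only the way the Industry effect is reported differs. Using `data = df` also gives shorter coefficient names than the `df$` form in the question.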
