
I have two variables in a dataset. The first is continuous (AHI); the second is a binary variable (OSA_status) that I created from the first: if AHI > 5 then OSA_status = 1, else OSA_status = 0. I am fitting a logistic regression with OSA_status (0/1) as the dependent variable and AHI as the independent variable. I expected AHI to be significantly associated with OSA_status, since OSA_status is derived from AHI itself. But my result is as follows. Can someone please explain why I got this result?

Call:
glm(formula = OSA_status ~ AHI, family = "binomial", data = pre_surgery)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-5.277e-04  -2.000e-08  -2.000e-08   2.000e-08   5.818e-04

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    517.7    36115.6   0.014    0.989
AHI           -104.6     7314.3  -0.014    0.989

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1.1829e+02  on 87  degrees of freedom

Residual deviance: 6.1704e-07  on 86  degrees of freedom
AIC: 4

Number of Fisher Scoring iterations: 25

Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

arshad

2 Answers


Your model seems to have perfect separation. Notice the large standard errors of the coefficient estimates, and the tell-tale warning messages:

Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

The fitting algorithm never converged, so the reported coefficients are not meaningful estimates. Because you defined OSA_status with a fixed cutoff of AHI > 5, with no probabilistic element, this isn't very surprising. When a cutoff exactly distinguishes the two groups, the likelihood keeps increasing as the slope grows without bound, so no finite maximum-likelihood estimate exists and logistic regression cannot be fit in the usual way.
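To see the problem in isolation, here is a minimal sketch that reproduces it. The data frame `sim` and its AHI values are made up for illustration (they are not the asker's `pre_surgery` data); only the AHI > 5 rule and the model formula come from the question.

```r
# Simulated stand-in for the asker's data; these AHI values are invented.
set.seed(1)
sim <- data.frame(AHI = runif(88, min = 0, max = 30))
sim$OSA_status <- as.integer(sim$AHI > 5)  # deterministic cutoff, as in the question

# Complete separation: every AHI in the 0 group lies below every AHI in the 1 group.
max_ahi_0 <- max(sim$AHI[sim$OSA_status == 0])
min_ahi_1 <- min(sim$AHI[sim$OSA_status == 1])
separated <- max_ahi_0 < min_ahi_1

# Fit the same kind of model and collect the warnings instead of printing them.
fit_warnings <- character(0)
fit <- withCallingHandlers(
  glm(OSA_status ~ AHI, family = binomial, data = sim),
  warning = function(w) {
    fit_warnings <<- c(fit_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(separated)                   # is the outcome perfectly separated by AHI?
print(fit_warnings)                # warnings raised by glm.fit
print(summary(fit)$coefficients)   # inspect the estimates and their standard errors
```

If `separated` is TRUE, any such fit should be read as a symptom of separation rather than as evidence that AHI is unrelated to OSA_status.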

For ways to deal with perfect separation, see for example this page and this page.

For the dangers of breaking a continuous predictor into discrete categories, see this page.

EdM

Are you sure that OSA_status is coded as a factor and AHI as numeric?

You could also try the reverse analysis and see what it gives, for example an ANOVA of AHI by OSA_status:

rstatix::anova_test(pre_surgery, AHI ~ OSA_status)

If this shows no difference either, you may have made a mistake when coding the OSA_status variable.
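One way to check the coding directly might look like the sketch below. The data frame `check` is a hypothetical stand-in with invented AHI values; with the real data you would run the same checks on `pre_surgery`.

```r
# Hypothetical stand-in data; replace `check` with pre_surgery in practice.
set.seed(2)
check <- data.frame(AHI = runif(88, min = 0, max = 30))
check$OSA_status <- factor(as.integer(check$AHI > 5), levels = c(0, 1))

# Confirm the variable types: AHI should be numeric, OSA_status a factor.
str(check)

# Cross-tabulate the stated rule (AHI > 5) against the stored status.
rule <- factor(check$AHI > 5, levels = c(FALSE, TRUE))
coding_table <- table(rule = rule, OSA_status = check$OSA_status)
print(coding_table)

# Rows that contradict the rule; this should be 0 if the coding is correct.
off_diagonal <- coding_table["TRUE", "0"] + coding_table["FALSE", "1"]
print(off_diagonal)
```

A nonzero off-diagonal count would point to a coding mistake, such as a reversed comparison or a mislabelled factor level.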