
I have a dataset and I want to do a logistic regression between the continuous variable "A" and the categorical variable "B". However, I also wanted to include "age" and "sex" variables as confounders in my statistical analysis. Could you please explain how I can do this in R? And also, how can I obtain the adjusted odds ratio for each variable while accounting for the effects of the covariates?

Is this a correct statistical analysis? Should I also include any interaction between these variables?

model <- glm(B ~ A + Age + Sex , data = Data, family = binomial())

Afterwards, how can I interpret it correctly?

Thanks

1 Answer


I have a dataset and I want to do a logistic regression between the continuous variable "A" and the categorical variable "B". However, I also wanted to include "age" and "sex" variables as confounders in my statistical analysis.

Your model does what you want it to do. By including Age and Sex, you're adjusting for their influence. However, consider modelling the continuous predictors nonlinearly, because assuming that they are linearly related to the log-odds of the outcome is a strong assumption. My personal preference is natural splines. In R, you could do it like this:

library(splines)

model <- glm(B ~ ns(A, df = 4) + ns(Age, df = 4) + Sex , data = Data, family = binomial())

This would create a natural spline for each of A and Age, each with 3 internal knots. More on that here. I recommend looking into the rms package and its function rcs, which places the knots in a more principled way. The package was developed by Frank Harrell, and his book is a very good source of information about the rms package and modelling in general.
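As a rough sketch of the spline approach, the following uses simulated data, since the original dataset is not shown. The variable names follow the question; the sample size, the seed and the curved effect of A are assumptions made purely for illustration. It inspects the knots that ns() chooses and compares the linear and spline models with a likelihood ratio test.

```r
library(splines)

set.seed(1)
n <- 500
# Simulated stand-in for the asker's data (hypothetical values)
Data <- data.frame(
  A   = rnorm(n),
  Age = runif(n, 20, 80),
  Sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
# Outcome generated with a curved effect of A (assumption for illustration)
Data$B <- rbinom(n, 1, plogis(-0.5 + Data$A^2 - 0.02 * (Data$Age - 50)))

# With df = 4, ns() places 3 internal knots at the quartiles of the variable
knots_A <- attr(ns(Data$A, df = 4), "knots")
print(knots_A)

linear_fit <- glm(B ~ A + Age + Sex, data = Data, family = binomial())
spline_fit <- glm(B ~ ns(A, df = 4) + ns(Age, df = 4) + Sex,
                  data = Data, family = binomial())

# The linear model is nested in the spline model, so a likelihood
# ratio test can compare the two fits
lrt <- anova(linear_fit, spline_fit, test = "Chisq")
print(lrt)
```

A small p-value in the comparison would suggest that the linear terms are too restrictive for these data. With real data, the choice of df is better made in advance than tuned on the test result.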

And also, how can I obtain the adjusted odds ratio for each variable while accounting for the effects of the covariates?

Exponentiate the coefficients and corresponding confidence limits to get odds ratios and their CIs. For nonlinear effects, use plots to show the relationship (e.g. with ggeffects).
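For the model with linear terms, this could look like the sketch below. The data are simulated because the original dataset is not available; the effect sizes and seed are arbitrary assumptions. It uses Wald intervals via confint.default(); calling confint() on the model instead would give profile-likelihood intervals.

```r
set.seed(2)
n <- 500
# Simulated stand-in for the asker's data (hypothetical values)
Data <- data.frame(
  A   = rnorm(n),
  Age = runif(n, 20, 80),
  Sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
Data$B <- rbinom(n, 1, plogis(-1 + 0.8 * Data$A +
                              0.02 * (Data$Age - 50) +
                              0.5 * (Data$Sex == "M")))

model <- glm(B ~ A + Age + Sex, data = Data, family = binomial())

# Coefficients and Wald confidence limits are on the log-odds scale;
# exponentiating them gives odds ratios and their confidence intervals
or_table <- exp(cbind(OR = coef(model), confint.default(model)))
print(round(or_table, 3))
```

Each row is adjusted for the other variables in the model. The row for A is the odds ratio per one-unit increase in A, and the row for SexM compares "M" with the reference level "F". The exponentiated intercept is the baseline odds, not an odds ratio.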

should I also include any interaction between these variables?

That's for you to decide before the analysis. The decision could be informed by the literature, expert knowledge and the sample size.
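If you do decide in advance to include an interaction, the syntax could look like the sketch below. It uses simulated data, and the A-by-Sex interaction is only a hypothetical choice to show the mechanics, not a recommendation for your analysis.

```r
set.seed(3)
n <- 500
# Simulated stand-in for the asker's data (hypothetical values)
Data <- data.frame(
  A   = rnorm(n),
  Age = runif(n, 20, 80),
  Sex = factor(sample(c("F", "M"), n, replace = TRUE))
)
Data$B <- rbinom(n, 1, plogis(-1 + 0.5 * Data$A +
                              0.6 * Data$A * (Data$Sex == "M")))

main_fit <- glm(B ~ A + Age + Sex, data = Data, family = binomial())
# A * Sex expands to A + Sex + A:Sex
int_fit  <- glm(B ~ A * Sex + Age, data = Data, family = binomial())

# Likelihood ratio test for the interaction term
lrt <- anova(main_fit, int_fit, test = "Chisq")
print(lrt)

# With an interaction, the odds ratio for A differs by sex
b <- coef(int_fit)
or_A_F <- unname(exp(b["A"]))                # reference level "F"
or_A_M <- unname(exp(b["A"] + b["A:SexM"]))  # level "M"
print(c(F = or_A_F, M = or_A_M))
```

Note that once the interaction is in the model, there is no longer a single adjusted odds ratio for A; it has to be reported separately for each level of Sex.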

Afterwards, how can I interpret it correctly?

Without any output or an indication of what the problem seems to be, it's difficult to give precise advice. Here is a good tutorial on logistic regression and its interpretation.

COOLSerdash
  • And see the R rms package for simple code that automates much of the process, e.g., giving estimates of odds ratios for automatically chosen ranges of covariate values such as quartiles and making splines easier to handle. Many examples are at https://hbiostat.org/rmsc – Frank Harrell Jan 26 '24 at 14:09
  • @COOLSerdash many thanks. Your explanation is so helpful. Do you mean that only by including the predictor and confounding variables simultaneously in the model, the model adjusts the odds ratio of any given variable of interest by controlling the effects of other variables? – Erfan Naghavi Jan 26 '24 at 19:22
  • @ErfanNaghavi Yes, the effect of each variable in the model is adjusted for all other included variables. So the effect of A is adjusted for Age and Sex. – COOLSerdash Jan 26 '24 at 21:45