6

I have a categorical data set that I want to use as predictor variables. One of the variables is slope, which is categorized into five classes: < 10 deg, 10-20 deg, 20-30 deg, 30-40 deg, and > 40 deg. I have taken the first class (< 10 deg) as the reference category. Now I am having trouble interpreting the beta value of the reference category, because it is not displayed in the binary logistic regression output. What would the value of beta be for the reference category? Any suggestions would be appreciated.

SHRABAN
  • 61

2 Answers

6

Setting

Let $X$ be the categorical predictor and suppose it has 3 levels ($X = 1$, $X = 2$, and $X = 3$). Let the third level be the reference category.

Define $X_1$ and $X_2$ as follows:

$$ X_1 = \left\{ \begin{array}{ll} 1 & \textrm{if } X = 1 \\ 0 & \textrm{otherwise;} \end{array} \right. $$

$$ X_2 = \left\{ \begin{array}{ll} 1 & \textrm{if } X = 2 \\ 0 & \textrm{otherwise.} \end{array} \right. $$

If you know both $X_1$ and $X_2$ then you know $X$. In particular, if $X_1 = 0$ and $X_2 = 0$ then $X = 3$.
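The same coding extends to the five slope classes in the question, with `< 10 deg` as the reference: four indicator variables, all zero for the reference class. A minimal Python sketch, assuming the class labels below (the strings and the helper name `dummy_code` are made up for illustration, not taken from any package):

```python
# Hypothetical labels for the five slope classes described in the question.
SLOPE_CLASSES = ["<10", "10-20", "20-30", "30-40", ">40"]
REFERENCE = "<10"  # reference category chosen by the asker


def dummy_code(level, levels=SLOPE_CLASSES, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference level, in the order of `levels`."""
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return [1 if level == other else 0 for other in levels if other != reference]


for level in SLOPE_CLASSES:
    print(level, dummy_code(level))
```

The reference class is the one row where every indicator is zero, which is why it has no column (and no beta) of its own.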

Logistic regression model

The model is written $$ \log \left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} $$ where $\pi_i$ denotes the probability of success of individual $i$ with covariate information $(x_{1i}, x_{2i})$.

  • If individual $i$ falls in category $1$ then $x_{1i} = 1$, $x_{2i} = 0$ and $\log \left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0 + \beta_1$.
  • If individual $i$ falls in category $2$ then $x_{1i} = 0$, $x_{2i} = 1$ and $\log \left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0 + \beta_2$.
  • If individual $i$ falls in category $3$ then $x_{1i} = 0$, $x_{2i} = 0$ and $\log \left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0$.
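To make the three cases concrete: when the model contains only this one categorical predictor, it is saturated, so the maximum-likelihood fit reproduces the observed success proportion in each category (as long as each proportion is strictly between 0 and 1). The coefficients can then be read off directly from the proportions. A small sketch with invented counts (the numbers are purely illustrative):

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Invented (successes, trials) per category; category 3 is the reference.
counts = {1: (30, 100), 2: (45, 100), 3: (20, 100)}
p = {k: s / n for k, (s, n) in counts.items()}

beta0 = logit(p[3])          # log odds of the reference category
beta1 = logit(p[1]) - beta0  # difference in log odds, category 1 vs 3
beta2 = logit(p[2]) - beta0  # difference in log odds, category 2 vs 3


def log_odds(x):
    """Linear predictor beta0 + beta1*x1 + beta2*x2 for category x in {1, 2, 3}."""
    x1 = 1 if x == 1 else 0
    x2 = 1 if x == 2 else 0
    return beta0 + beta1 * x1 + beta2 * x2


for x in (1, 2, 3):
    print(x, round(log_odds(x), 4), round(logit(p[x]), 4))
```

Each printed pair agrees, and for the reference category the linear predictor is just `beta0`.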

Odds ratio

Odds ratios are computed with respect to the reference category. For example, for 'category 1 vs category 3' we have

$$ \frac{\exp(\beta_0 + \beta_1)}{\exp(\beta_0)} = \exp(\beta_1). $$
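Put differently, the reference category behaves as if it had a coefficient fixed at $0$, so its odds ratio against itself is $\exp(0) = 1$. Two non-reference categories can be compared through the difference of their coefficients, e.g. $\exp(\beta_1 - \beta_2)$ for 'category 1 vs category 2'. A short sketch using invented per-category success probabilities (illustrative values only):

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Invented success probabilities per category; category 3 is the reference.
p = {1: 0.30, 2: 0.45, 3: 0.20}

# Coefficients relative to the reference; the reference itself is fixed at 0.
beta = {k: logit(p[k]) - logit(p[3]) for k in p}

or_1_vs_3 = math.exp(beta[1])            # category 1 vs reference
or_3_vs_3 = math.exp(beta[3])            # reference vs itself, exp(0) = 1
or_1_vs_2 = math.exp(beta[1] - beta[2])  # two non-reference categories

# Cross-check against the odds computed directly from the probabilities.
odds = {k: p[k] / (1.0 - p[k]) for k in p}
print(or_1_vs_3, odds[1] / odds[3])
print(or_1_vs_2, odds[1] / odds[2])
```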

ocram
  • 21,851
2

This is standard for a single categorical variable. The intercept is the log odds for the reference category, and the dummy-variable betas are the differences in log odds relative to the reference category. So an "insignificant" dummy variable means that the log odds for that category aren't significantly different from those of the reference category. This is the same as ordinary ANOVA, just on the log-odds scale instead of the raw scale.