
What is the reasoning behind recoding ordinal independent variables in a binary logistic regression?

Example: imagine a researcher is interested in the effects of some political attitudes on turnout {0;1}. He or she would draw on survey data, and the key independent variable of interest would be a typical Likert-scale variable like this one ('internal efficacy'):

Normal people like me cannot make a difference in politics.
1 - Strongly agree
2 - Somewhat agree
3 - Neither agree nor disagree
4 - Somewhat disagree
5 - Strongly disagree

Suppose the researcher then recodes the survey responses into a dichotomous independent variable by combining 1 & 2 into 1 and assigning 0 to everything else. This would of course discard information (i.e., 'somewhat agree' is treated the same as 'strongly agree'), but it would simplify the interpretation in the sense that a respondent can be thought of as either having high internal efficacy or not.
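To make the recoding concrete, here is a minimal sketch in Python. The response values are made up for illustration, and the function name `dichotomize` is hypothetical; it only shows the mapping described above (codes 1 and 2 become 1, everything else becomes 0) and how the five categories collapse into two.

```python
from collections import Counter

# Hypothetical survey responses on the 5-point item
# (1 = strongly agree ... 5 = strongly disagree); not real data.
responses = [1, 2, 2, 3, 4, 5, 5, 1, 3, 4]


def dichotomize(response: int) -> int:
    """Return 1 for codes 1 and 2 (agree), 0 for codes 3, 4 and 5."""
    if response not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected response code: {response}")
    return 1 if response in (1, 2) else 0


recoded = [dichotomize(r) for r in responses]

# Frequency tables before and after recoding: five categories become two,
# so 'strongly agree' and 'somewhat agree' can no longer be told apart.
print(Counter(responses))
print(Counter(recoded))
```

After the recoding, a model can only estimate one contrast (agree vs. everything else) instead of up to four contrasts between the original response categories.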

Here are two examples of analyses of voter turnout that actually recode independent variables this way, or at least along similar lines; one of them is an article of my own.

Habersack, F., Heinisch, R., Jansesberger, V., & Mühlböck, A. (2021). Perceived Deprivation and Voter Turnout in Austria: Do Views on Social Inequality Moderate the Deprivation—Abstention Nexus?. Political Studies, Link.

Mahlangu, T., & Schulz-Herzenberg, C. (2022). The Influence of Political Efficacy on Voter Turnout in South Africa. Politikon, 1-17, Link.

I wonder if there is any other reasoning behind transforming independent variables in logistic regression like this, meaning other than just simplifying the interpretation of survey responses.

EDIT (1): I'm aware of @Tom's question from 2013 (What is the benefit of breaking up a continuous predictor variable?), but in contrast to his post, I'm exclusively interested in binary logistic models and the practice of dichotomizing ordinal variables (because it seems to be related to logistic regression models and the nature of the outcome variable).

EDIT (2): I have edited my initial question to clarify that I am interested only in the dichotomization of independent variables in logistic regression. I am not interested in dependent variables or the appropriate choice of logistic regression (binary, ordinal...).

  • "does it have to do with the link function and the fact that the outcome variable is also dichotomous" no, the independent variables should be numerical, just like in linear regression – user357269 Oct 16 '19 at 20:53
  • This seems to be different from @Stephan's suggested duplicate because it asks about categorical predictors & only with respect to logistic regression. I didn't see that addressed in the comments & answers to that post. – Michael R. Chernick Oct 16 '19 at 21:48
  • Can you refer or link to some papers where this is done? – kjetil b halvorsen Oct 16 '19 at 21:51
  • I agree with @MichaelChernick. The question by Tom from five years ago goes into a similar direction, but I was mainly interested in the reasoning behind dichotomizing ordinal, categorical variables in binary logistic regression models... – Dr. Fabian Habersack Oct 17 '19 at 12:30
  • @kjetilbhalvorsen I can't really point to one specific article. I mean, it's done for instance in this book (Ch. 4), but I guess there are numerous other studies doing this:

    — Morales, L., & Giugni, M. (Eds.). (2016). Social capital, political participation and migration in Europe: making multicultural democracy work?. Springer.

    – Dr. Fabian Habersack Oct 17 '19 at 12:50
  • ...I was just wondering whether dichotomizing ordinal variables is done because the outcome variable is dichotomous, too (because I'm not sure this would do justice to how the model is estimated, to the link function and the non-linear relationship), or would the reasoning be that you cannot treat 5-point ordinal variables as metric and therefore you dichotomize them...?! – Dr. Fabian Habersack Oct 17 '19 at 12:55
