How to Combine Correlated Binary Variables

Question

I have count data, and most of my explanatory variables are binary (they represent different types of support given to a group). Three of these variables seem to be correlated with each other. I don't want to drop any of them, as the categories are different and essential. So I thought I could create a new variable that combines them, for example a dummy equal to 1 when at least one of the variables is 1, or a variable equal to the sum of their values. However, I am not sure whether this is the right way to handle it or what the alternatives are. Could anyone please help me with that?

can you explain what you are trying to do with these variables? — George Savva, Oct 10 '23 at 08:42
So the unit of observation is the group or the individual within the group? — George Savva, Oct 10 '23 at 08:54

score 2 · Answer 1 · answered Oct 10 '23 at 09:56

There are several possibilities. Which one you choose depends on how much data you have, how many dichotomous independent variables (IVs) you have, and the patterns among them.

If you have a relatively large amount of data and a relatively small number of variables, you can make a single categorical variable that encodes every combination of the originals. E.g. if you had three dichotomous variables (Y = yes, N = no), the new variable would have the levels

YYY
YYN
YNY
YNN

and so on, for 8 levels in total (2 x 2 x 2).
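
As a minimal sketch of this combined variable, assuming three hypothetical 0/1 columns called support_a, support_b and support_c (these names and the toy rows are made up for illustration, not taken from the question), the pattern label could be built in plain Python like this:

```python
from itertools import product

# Hypothetical data: one dict per observation, each support type coded 0/1.
rows = [
    {"support_a": 1, "support_b": 1, "support_c": 0},
    {"support_a": 0, "support_b": 0, "support_c": 0},
    {"support_a": 1, "support_b": 0, "support_c": 1},
]
SUPPORT_COLS = ["support_a", "support_b", "support_c"]


def pattern_label(row, cols=SUPPORT_COLS):
    """Return a string such as 'YYN', one letter per binary variable."""
    return "".join("Y" if row[c] == 1 else "N" for c in cols)


# All 2**3 = 8 possible levels, in the order YYY, YYN, YNY, YNN, ..., NNN.
all_levels = ["".join(p) for p in product("YN", repeat=len(SUPPORT_COLS))]

# One label per observation; this is the new categorical variable.
labels = [pattern_label(r) for r in rows]
print(labels)
```

The label would then go into the model as a single categorical predictor, with one level (NNN is a natural choice) as the reference category.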

If you have less data or more variables, you could use a count instead. So, if you had five kinds of support, the count would take a value from 0 to 5.
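
A rough sketch of the count, next to the "at least one" dummy from the question, might look like the following. The rows are invented toy data with five hypothetical support indicators per observation.

```python
# Hypothetical data: five kinds of support, each coded 0/1 per observation.
rows = [
    [1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]

# Count of support types received: an integer from 0 to 5.
support_count = [sum(r) for r in rows]

# The "at least one" dummy from the question, shown for comparison.
any_support = [int(any(r)) for r in rows]

print(support_count)
print(any_support)
```

Note that the count treats every kind of support as interchangeable, and the dummy additionally discards how many kinds were received, so both lose the information about which kinds were given.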

A more complicated option is to look at the patterns and make sensible choices about which levels of the combined variable from the first idea to keep and which to merge. You could (and maybe should) do this using substantive knowledge and eyeballing the data, or you could perhaps use some kind of cluster analysis. See this thread for some ideas.
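
One possible starting point for eyeballing the patterns is simply to tabulate how often each one occurs. The sketch below does that and, purely as an illustration, lumps rare patterns into an "other" level. The labels, the MIN_COUNT threshold and the "other" level are all assumptions made up for this example, and a frequency cutoff is not a substitute for substantive knowledge about which kinds of support belong together.

```python
from collections import Counter

# Hypothetical pattern labels, one per observation (e.g. 'YNN').
labels = ["YYY", "YYY", "NNN", "NNN", "NNN", "YNN", "YNN", "NYN", "YYN"]

# Arbitrary illustrative threshold: patterns seen fewer times are merged.
MIN_COUNT = 2

# Frequency table of the observed patterns, for eyeballing.
counts = Counter(labels)
for pattern, n in counts.most_common():
    print(pattern, n)

# Keep frequent patterns as their own level; merge the rest into 'other'.
collapsed = [lab if counts[lab] >= MIN_COUNT else "other" for lab in labels]
print(collapsed)
```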
