In a discussion, someone claimed that because gender has only two categories, it can be correlated with a continuous variable. Is it acceptable to use Pearson correlation between a continuous variable and a binary variable?
1 Answer
Sex is a nominal variable: there is no origin, no ordering, etc. However, most dichotomous nominal variables can be treated as numeric variables that take only two values. When you do that, the usual tests all degenerate to the same result.
Consider a simple regression in your case of two sexes. Code males as zero and females as one.
Because the regression is asked to fit two parameters to two group means, the "intercept" will be the male mean and the "slope" will be the difference between the female mean and the male mean (the intercept).
The $F$-test for this regression will degenerate to the pooled-variance $t$-test for the two groups (literally, $F=t^2$). The $F$-test also tests slope = 0 and correlation = 0, because all of these tests degenerate to the same test under these conditions.
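To make the equivalence concrete, here is a minimal sketch in plain Python. The data are made up for illustration, and the helper names (`ols_and_tests`, `pooled_t`) are my own, not from any library. It fits the regression by hand with males coded 0 and females coded 1, then compares the result with the pooled-variance two-sample $t$ statistic.

```python
import math
from statistics import mean

def ols_and_tests(x, y):
    """Simple OLS of y on x, plus Pearson r and the regression F statistic."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    # Residual and regression sums of squares (1 and n - 2 degrees of freedom).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = syy - sse
    f_stat = ssr / (sse / (n - 2))
    return {"intercept": intercept, "slope": slope, "r": r, "F": f_stat}

def pooled_t(group0, group1):
    """Two-sample t statistic with pooled (equal) variance, group1 minus group0."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = mean(group0), mean(group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    sp2 = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(sp2 * (1 / n0 + 1 / n1))

# Hypothetical measurements, invented for this example.
males = [4.1, 5.0, 5.6, 6.2]
females = [6.8, 7.1, 7.9, 8.4, 9.0]

x = [0] * len(males) + [1] * len(females)  # males = 0, females = 1
y = males + females

fit = ols_and_tests(x, y)
t_stat = pooled_t(males, females)
# t statistic for testing correlation = 0, computed from r alone.
t_from_r = fit["r"] * math.sqrt((len(y) - 2) / (1 - fit["r"] ** 2))

print("intercept vs male mean:", fit["intercept"], mean(males))
print("slope vs mean difference:", fit["slope"], mean(females) - mean(males))
print("F vs t^2:", fit["F"], t_stat ** 2)
print("t from r vs pooled t:", t_from_r, t_stat)
```

Each printed pair should agree up to floating-point rounding.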
By the way, the coding does not matter in this simple case either. You could code males=-1 and females=+1 and the test statistics and $p$-values will be the same; only the numerical values of the intercept and slope are rescaled.
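A quick way to check the coding claim is to fit the same made-up data under both codings. This is a sketch with a hypothetical helper `fit_line`; the data are invented. Under the -1/+1 coding the slope should come out as half the mean difference and the intercept as the midpoint of the two group means, while $r$ and $F$ stay put.

```python
import math
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope, Pearson r, F statistic) for simple OLS of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    # For simple regression, F = r^2 * (n - 2) / (1 - r^2).
    f_stat = r ** 2 * (n - 2) / (1 - r ** 2)
    return intercept, slope, r, f_stat

# Hypothetical measurements, invented for this example.
males = [4.1, 5.0, 5.6, 6.2]
females = [6.8, 7.1, 7.9, 8.4, 9.0]
y = males + females

x01 = [0] * len(males) + [1] * len(females)    # males = 0, females = +1
xpm = [-1] * len(males) + [1] * len(females)   # males = -1, females = +1

a01, b01, r01, f01 = fit_line(x01, y)
apm, bpm, rpm, fpm = fit_line(xpm, y)

print("r under each coding:", r01, rpm)
print("F under each coding:", f01, fpm)
print("slopes (second is half the first):", b01, bpm)
print("intercepts (second is the midpoint of the group means):", a01, apm)
```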
When you have more than two categories, though, where you place the third and any additional points matters, and the tests are no longer degenerate.
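As a small illustration of why placement matters with three categories, the sketch below uses invented data for three hypothetical groups (`a`, `b`, `c`) and a hand-written `pearson_r` helper. The same data are correlated with two different numeric codings of the groups, and the correlation changes.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation computed directly from sums of squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical outcome values for three nominal groups, invented for this example.
groups = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
labels = [g for g, values in groups.items() for _ in values]
y = [v for values in groups.values() for v in values]

# Two arbitrary numeric placements of the same three categories.
coding_1 = {"a": 0, "b": 1, "c": 2}
coding_2 = {"a": 0, "b": 2, "c": 1}

r_1 = pearson_r([coding_1[g] for g in labels], y)
r_2 = pearson_r([coding_2[g] for g in labels], y)

print("r with coding a=0, b=1, c=2:", r_1)
print("r with coding a=0, b=2, c=1:", r_2)
```

With two categories, any relabeling is just a linear rescaling of the 0/1 code, so only the sign of $r$ can change. With three, swapping two labels is not a linear rescaling, and $r$ itself changes.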
Does this help?
Also, it's questionable if there are truly precisely two genders (or sexes). – jona Jan 15 '16 at 19:21